Augmented knowledge base and reasoning with uncertainties and/or incompleteness

ABSTRACT

A knowledge-based system under uncertainties and/or incompleteness, referred to as augmented knowledge base (AKB) is provided, including constructing, reasoning, analyzing and applying AKBs by creating objects in the form E→A, where A is a rule in a knowledgebase and E is a set of evidences that supports the rule A. A reasoning scheme under uncertainties and/or incompleteness is provided as augmented reasoning (AR).

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 17/127,030, filed on Dec. 18, 2020, which is a continuation application of U.S. patent application Ser. No. 15/587,917, filed on May 5, 2017, now U.S. Pat. No. 10,878,326, which is a continuation application of U.S. patent application Ser. No. 14/994,635, filed on Jan. 13, 2016, now U.S. Pat. No. 9,679,251, which is a continuation of U.S. patent application Ser. No. 13/836,637, filed Mar. 15, 2013, now U.S. Pat. No. 9,275,333, and which is based upon and claims priority to prior US Provisional Patent Application No. 61/645,241 filed on May 10, 2012 in the US Patent and Trademark Office, the entire contents of which are incorporated herein by reference.

FIELD

The embodiments of the present invention relate to a computer-implemented knowledge-based system and reasoning with uncertainties and/or incompleteness.

BACKGROUND

Reasoning with uncertainties and/or incompleteness refers to the various processes leading from evidences or clues to conclusions or guesses using uncertain, vague, partial, incomplete and/or limited information. Reasoning with uncertainties mostly refers to information that is uncertain, vague and/or inexact, while reasoning with incompleteness refers to information that is incomplete, partial and/or limited. A knowledge-based system under uncertainties and/or incompleteness is a knowledge base where reasoning with uncertainties and/or incompleteness is involved.

Knowledge-based systems involving uncertainties and/or incompleteness have been studied widely in the literature. Many approaches have been introduced to model such knowledge bases, but none are satisfactory in general.

The basic building blocks of knowledge bases are pieces of knowledge, which are usually represented as rules, propositions, or other equivalent means.

Moreover, in most traditional knowledge bases with uncertainties and/or incompleteness, each piece of knowledge in the knowledge base is associated with or mapped to a number, variably referred to as belief, certainty factor, likelihood, weight, etc.

To perform reasoning/inferences, either an extension scheme for the mapping mentioned above, or a conditioning/composition rule must be specified.

SUMMARY

According to embodiments of the invention, a new knowledge-based system under uncertainties and/or incompleteness, referred to as augmented knowledge base (AKB), is provided; and methods for constructing, reasoning, analyzing and applying AKBs are provided. In addition, a new method for reasoning under uncertainties and/or incompleteness is provided. This method will be referred to as augmented reasoning (AR). Advantages and powers of AKB and AR are described herein.

According to an aspect of an embodiment, a method and apparatus are provided for representing a knowledge base by creating objects in the form E→A, where A is a rule in the knowledge base, and E is a set of evidences that supports the rule A; and determining, by deductive reasoning using the knowledge in the knowledge base and/or inducting or extracting new knowledge from the knowledge base; and constructing higher order knowledge bases to ensure that the knowledge base is consistent. Such a knowledge base will be referred to as an augmented knowledge base. For example, a composite set of evidences G is computed, where the composite set of evidences G is a combination of sets of evidences E in the knowledge base in support of a target rule L that is implied by a combination of rules in the knowledge base.

These and other embodiments, together with other aspects and advantages which will be subsequently apparent, reside in the details of construction and operation as more fully hereinafter described and claimed, reference being had to the accompanying drawings forming a part hereof, wherein like numerals refer to like parts throughout.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a functional block diagram of an augmented knowledge base (AKB) computer system, according to an embodiment of the present invention.

FIG. 2 is an example data structure of an AKB with example data entries, according to an embodiment of the present invention.

FIG. 3 provides the basic mechanism for deductive reasoning on an AKB, according to an embodiment of the present invention.

FIG. 4 is a flow diagram of an example unification algorithm, according to an aspect of an embodiment of the invention.

FIG. 5 provides types of measures for representing the strength of a body of evidences, according to an embodiment of the present invention.

FIG. 6 is a flow diagram of deductive inference in AKB, according to an embodiment of the present invention.

FIG. 7 is a table of methods for testing and ensuring properties of AKB, according to an embodiment of the present invention.

FIG. 8 is a flow diagram of inductive inference in AKB, according to an embodiment of the present invention.

FIG. 9 is a flow diagram of constructing consistent higher order AKBs, according to an embodiment of the present invention.

FIG. 10 is a diagram for illustrating the relations for generating a free-form database, augmented relational database, augmented deductive database and augmented inductive database, according to an embodiment of the present invention.

FIG. 11 is a flow diagram for generating graph representations of constraints for bodies of evidences, according to an embodiment of the present invention.

FIG. 12 is a flow diagram of checking for admissibility, according to an embodiment of the present invention.

FIG. 13 is a flow diagram of checking for compatibility between the measure and the collection of constraints, according to an embodiment of the present invention.

FIG. 14 is a flow diagram of guaranteeing compatibility, according to an embodiment of the present invention.

FIG. 15 is a flow diagram of preprocessing, according to an embodiment of the present invention.

FIG. 16 is a functional block diagram of a computer, which is a machine, for implementing embodiments of the present invention.

DETAILED DESCRIPTION OF EMBODIMENT(S)

Section 1. Augmented Knowledge Base

According to an aspect of an embodiment, a knowledge-based system refers to a system comprising a knowledge base for representing knowledge, acquisition mechanisms and inference mechanisms. An augmented knowledge base κ includes one or more objects of the form E→A, where A is a logical sentence in a first-order logic, a proposition and/or a rule (referred to collectively herein as ‘a rule’) in the traditional sense; and E is a set of evidences (a plurality of evidences) as a body of evidences that supports A. Moreover, each given body of evidences (not each rule as in a typical knowledge base) is mapped to or associated with a value.

Since bodies of evidences are sets, in an aspect of an embodiment, the relations among the bodies of evidences are considered. The ability of AKBs to deal with relationships among the bodies of evidences, or their capability to take into account constraints imposed on the bodies of evidences, is a unique feature of AKB, which can lead to more powerful, robust, accurate and/or refined characterizations of the knowledge base.

In an embodiment of the present invention, inference or reasoning in an AKB is done in two separate phases: form the body of evidences that supports the target rule, and then determine the value associated with the resulting body of evidences. The second phase can be done in many different ways, including the reasoning scheme AR based on the constraint stochastic independence method, which is part of this invention. AR provides a clear probabilistic semantics, resulting in the elimination of virtually all known anomalies associated with existing formalisms of uncertain reasoning and knowledge bases.

In addition, in an aspect of an embodiment, we provide methods for testing and ensuring certain properties of AKBs, including consistency, completeness, monotonicity, contribution, vulnerability, deception, etc.

Moreover, in an aspect of an embodiment, we show how AKBs can be used for inductive inference. Indeed, we show that inductive inference can be used to extract meaningful new knowledge from AKBs provided certain consistency conditions are met.

Furthermore, in an aspect of an embodiment, we provide methods to guarantee (also referred to herein as a substantial guarantee) a required (or specified) consistency via higher order AKBs.

Lastly, in an aspect of an embodiment, we show how AKBs can be used to build a free-form database (FFDB). We then show how FFDB can be used to formulate augmented relational databases, augmented deductive databases and augmented inductive databases. The latter can be used for extracting new information from relational databases (e.g. data mining).

In an embodiment of the present invention, a method for solving the following age-old problem is provided: Given two arbitrary sets A and B, if the probabilities, p(A) and p(B), of both sets are known, what is the probability of their intersection, i.e., p(A∩B)? This problem could be trivial if the probability is defined over the entire probability space. However, in many important applications, such as reasoning with uncertainty, which is one of the main concerns of this invention, the probabilities of only a few sets are given and there is a need to determine the probabilities of other sets formed from these sets, i.e., extending the probability measure from a given collection of sets to a larger collection of sets formed from the given sets. Knowing how to compute p(A∩B) from p(A) and p(B) provides the cornerstone for the extension schemes.

The extension scheme given in an embodiment of the present invention uses an average method to determine the probability of the new set. It has a clear probabilistic semantics, and is therefore not subject to the anomalies that are usually associated with other extension schemes.

Moreover, in an aspect of an embodiment, the present invention is capable of dealing with constraints imposed on the sets that are given.

Furthermore, in an aspect of an embodiment, necessary and sufficient conditions for the extension to be well-defined, as well as necessary and sufficient conditions for compatibility, are provided.

Lastly, in an embodiment of the present invention, a preprocessing algorithm is provided so that the extension can be computed more efficiently.

Let κ be an AKB. The A in (E→A) is a rule in κ, which can be a logical sentence in a first-order logic, a proposition and/or a rule in a traditional knowledge base. In this case, we say that Aϵℒ, i.e., A is a member of ℒ, where ℒ is the collection of all knowledge of interest. On the other hand, the E in (E→A) is a set, representing a body of evidences, and is not a rule. In this case, we say that E⊆U, i.e., E is a subset of U, where U is a universal set of evidences. Moreover, the (E→A) is not a rule either. It specifies that the body of evidences E supports the rule A. More precisely, if E is established or true, then A is true. This signifies that the full force of E devolves on A. Associating a body of evidences (i.e., a set of evidences) with a rule is a novel feature of this invention and is a central characteristic of augmented knowledge bases. The bodies of evidences supporting a rule can be obtained from many different sources (including expert opinions, scientific literature, news articles, blogs, etc.) and can be determined either subjectively or objectively.

In the rest of this document, ⊆ will always denote ‘is a subset of’, while ϵ will denote ‘is a member of’ or ‘belonging to’.

In most traditional knowledge-based systems that deal with uncertainties and/or incompleteness, rules are associated with numbers. These numbers are variably referred to as beliefs, certainty factors, likelihood, weight, etc. On the other hand, as illustrated in FIG. 1, in an AKB computer system 100, each rule A in an AKB is associated with a ‘set’ E, the body of evidences (plural or two or more evidences) that supports the rule. In addition, in contrast to typical knowledge bases, according to an embodiment, a set of evidences E as a given body of evidences (not each rule) is mapped to or associated with a value (e.g., weight) (see also FIG. 2). These are several unique features of AKB. In FIG. 1, a computer 120 is configured to provide the described augmented knowledge base functions, including storing an augmented knowledge base 130. Experts and other sources 110 refer to one or more processing devices (e.g., computer, mobile device, mobile phone, etc.) as sources of evidences (E) and/or rules (A) to the AKB computer 120 over one or more data communication technologies, such as Internet, wire, wireless, cellular phone network, etc. The form of information of the experts and other sources 110 can be a database, blog, chat rooms data, social network website databases, other knowledge bases, etc. FIG. 2 is an example data structure of an augmented knowledge base (AKB) 150 with example data entries for the AKB, according to an embodiment of the present invention. The embodiments utilize entries of the AKB, such as those of FIG. 2, to further augment a knowledge base of rules in ℒ 154 by determining (outputting or computing) support for a desired (target) rule L, namely an unknown rule that one would like to establish. According to an aspect of an embodiment, a target rule L can be added to the rules in ℒ 154 in the AKB 150.

In FIG. 2, for example, 154 is ℒ, which is the collection of all knowledge in the form of rules A 155 of interest. The body of evidences 160 (E) support the respective rules 155 (A) (i.e., (E→A)). In FIG. 2, the body of evidences 160 (E) is mapped to a number 160, variably referred to as belief, certainty factor, likelihood, weight, etc. Sources 110 for the evidences are also provided. The constraints 165 are relationships among the bodies of evidences Es 156 (the capability to take into account constraints 165 imposed on the bodies of evidences Es 156), which is a feature of AKB to derive more powerful, robust, accurate and/or refined characterizations of the knowledge base rules ℒ. For example, in FIG. 2, E106 is a subset of E104, since the pH level in E106 contains that of E104.

An object generated includes a set of evidences E156 and the rule A156supported by E156. According to an aspect of an embodiment, an object inthe form of E→A may be implemented as a data object (a data structure, aclass) for electronically storing a set of evidences E156 and a ruleA156 supported by E156. The data object may be associable or includelinks to (a) value (e.g., weight) for E160, (b) sources in support ofE110, and (c) constraints for E165, as discussed in more detail herein.
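The data object described above can be pictured with a small sketch. The class name `AKBObject` and its field names are hypothetical and chosen only for illustration; the sample entry reuses the E101 row quoted later in this section, and the source label is a placeholder.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AKBObject:
    """One entry E -> A of an augmented knowledge base (illustrative layout only)."""
    evidence_id: str                    # identifier of the body of evidences E
    rule: str                           # the rule A, kept here as plain text
    weight: float                       # value associated with E (not with A)
    sources: Tuple[str, ...] = ()       # where the body of evidences came from
    constraints: Tuple[str, ...] = ()   # relations to other bodies of evidences


# Sample entry; the source label is a placeholder, not data from FIG. 2.
entry = AKBObject(
    evidence_id="E101",
    rule="water pH level less than 6.0",
    weight=0.0014,
    sources=("hypothetical expert",),
)
```

Keeping the weight on the evidence field rather than on the rule mirrors the point made above that the value is attached to E, not to A.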

An apparatus is provided including a computer readable storage medium configured to support managing (by way of storing) objects in the form E→A, where A is a rule in the knowledge base, and E is a set of evidences that supports the rule A; and a hardware processor (for example, computer processor and/or circuitry) to execute computing a composite set of evidences G, the composite set of evidences G being a combination of sets of evidences E in the knowledge base in support of a target rule L implied by a combination of rules in the knowledge base. E and A are electronically represented in computer readable media as data values (identifiers), including other values associated with E and/or A, such as weight values assigned to E, for processing, for example, generation (formation), computations or determinations of set(s) of data values and relations among the sets to deduce and/or induce target rule L as well as indicate validity or strength of target rule L.

Since bodies of evidences are sets, AKB could explicitly consider the relations among the bodies of evidences. The ability of AKBs to deal explicitly with relationships among the bodies of evidences, or their capability to take into account constraints imposed on the bodies of evidences, is another unique feature of AKB, which can lead to more powerful, robust, accurate and/or refined characterizations of the knowledge base.

In an AKB, evidences (the Es) and rules (the As) can represent completely different objects and each play an essential but separate role in the operations of an AKB. The dichotomy between rules and evidences is a novel approach, and is one of the critical differences between AKB and all the known models of knowledge bases. The power and flexibility offered by such a separation is yet another unique feature of AKB.

In an AKB, for example, no conditions are imposed on either E or A. Therefore, for example, it is further possible to have both E₁→A and E₂→A in the AKB. This means two or more sets of evidences (a set being two or more evidences, also referred to as a ‘body of evidences’) can support the same rule, thereby allowing, among other things, multiple experts/sources to be involved in designing a single AKB. Since typical knowledge bases do not associate evidences with rules, they would not be readily capable of dealing with multiple experts/sources for consolidating rules from different experts/sources. On the other hand, AKBs allow experts/sources to interact freely and completely, if necessary. AKBs could even monitor the experts/sources to determine their reliabilities and possible deceptions (see Section 4). This is another unique feature of AKB.

Let κ be an AKB. The extension of κ, denoted by κ̂, is defined recursively as follows:

-   -   1. (Ø→_(κ)F), (U→_(κ)T)ϵκ̂, where T and F represent the logical constants TRUE and FALSE, respectively.
-   -   2. If ω=(E→L)ϵκ then (E→_(κ)L)ϵκ̂.
-   -   3. If ω₁=(E₁→_(κ)L₁), ω₂=(E₂→_(κ)L₂)ϵκ̂ then ω₁∨ω₂=((E₁∪E₂)→_(κ)(L₁∨L₂))ϵκ̂ and ω₁∧ω₂=((E₁∩E₂)→_(κ)(L₁∧L₂))ϵκ̂.

κ̂ extends κ so that AKBs can deal with composite objects ω, associated with combinations of sets of evidences E in the knowledge base and combinations of rules in the knowledge base. Therefore, the embodiments utilize both composite evidences and composite rules to establish support for a target rule.
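The recursive construction of composite objects can be sketched as follows. This is a minimal illustration, assuming bodies of evidences are finite sets of labels and rules are kept as strings; the class name `Obj` and the evidence labels `e1` to `e3` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Obj:
    """An object E -> L; evidence plays the role of E, rule the role of L."""
    evidence: FrozenSet[str]
    rule: str

    def __or__(self, other: "Obj") -> "Obj":
        # w1 v w2 = ((E1 union E2) -> (L1 or L2))
        return Obj(self.evidence | other.evidence, f"({self.rule} or {other.rule})")

    def __and__(self, other: "Obj") -> "Obj":
        # w1 ^ w2 = ((E1 intersect E2) -> (L1 and L2))
        return Obj(self.evidence & other.evidence, f"({self.rule} and {other.rule})")


w1 = Obj(frozenset({"e1", "e2"}), "A1")
w2 = Obj(frozenset({"e2", "e3"}), "A2")
disjunction = w1 | w2   # evidence {e1, e2, e3}, rule "(A1 or A2)"
conjunction = w1 & w2   # evidence {e2}, rule "(A1 and A2)"
```

Union pairs with disjunction and intersection with conjunction, exactly as in item 3 of the definition.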

In the rest of this document, ∪ and ∩ denote ‘set union’ and ‘set intersection’, respectively, while ∨ and ∧ denote ‘logical or’ and ‘logical and’, respectively.

Let κ be an AKB. If ω=(α→_(κ)β)ϵκ̂, then l(ω)=α and r(ω)=β.

ω is a composite object, l(ω) denotes the composite set of evidences associated with ω, and r(ω) denotes the composite rule associated with ω. This enables κ̂ to extract the rule portion and the associated set of evidences portion from ω, where ω represents a composite object of a plurality of E→A.

Let κ be an AKB, G⊆U (G is a body of evidences from the universal set of evidences U) and Lϵℒ, where ℒ is the collection of knowledge or rules (given or otherwise) which we are interested in. In FIG. 3, the basic operation at 210 for deductive reasoning is $G \overset{d}{\rightarrow}_{\kappa} L$, where G is a composite set of evidences and L is a target rule. A target rule L refers to a rule in ℒ or an unknown rule that one would like to establish. More detailed explanations can be found throughout this section.

$G \overset{d}{\rightarrow}_{\kappa} L$, where d denotes deductive reasoning, if and only if there exists ωϵκ̂ such that l(ω)=G and r(ω)⇒L. Here l(ω)=G means that G is the composite set of evidences that supports r(ω), ω being a composite object of a plurality of E→A, and r(ω)⇒L means that if the composite rule r(ω) is true, then the target rule L is true. In other words, ⇒ stands for ‘logical implication.’

$G \overset{d}{\rightarrow}_{\kappa} L$ is one of the key concepts for dealing with deductive reasoning and other aspects of AKBs. Among other things, it provides a natural way to show how a given Lϵℒ is deductively related to κ. (See also FIG. 3). To put it in another way, $G \overset{d}{\rightarrow}_{\kappa} L$ means there exists a composite object ω in the AKB κ (ω is formed from objects of a plurality of E→A in κ), where G is the composite set of evidences associated with ω, and the composite rule associated with ω implies L.

As an example, consider the partial AKB given in FIG. 2. Assume the target rule L is:

-   -   Velvet disease is TRUE or the fish color is yellowish.

To determine G, we will use the following objects in the AKB:

-   -   1. (E101→A101) where A101 is the rule ‘water pH level less than 6.0’ and the weight of E101 is 0.0014.
-   -   2. (E102→A102) where A102 is the rule ‘water pH level is in [6.0, 7.0), i.e., the water pH level is greater than or equal to 6.0 but less than 7.0’ and the weight of E102 is 0.1930.
-   -   3. (E121→A121) where A121 is the rule ‘water condition is acidic when water pH level is less than 6.0 or in [6.0, 7.0)’ and the weight of E121 is 1.0000.
-   -   4. (E133→A133) where A133 is the rule ‘fish color is discolored (yellowish) when water condition is acidic and velvet disease is false’ and the weight of E133 is 0.3028.

Since (A101∨A102)∧A121∧A133 ⇒ L, it follows that $G \overset{d}{\rightarrow}_{\kappa} L$ where G=(E101∪E102)∩E121∩E133. It is possible to have many other G where $G \overset{d}{\rightarrow}_{\kappa} L$. For example, it is not necessary to use both objects 1 and 2: $G \overset{d}{\rightarrow}_{\kappa} L$ holds by using only one or the other alone. Following the line of reasoning given above, if we observed that ‘the fish color is not yellowish’, then we can show that any G, where $G \overset{d}{\rightarrow}_{\kappa} L$, is a set of evidences that supports the target rule ‘velvet disease is TRUE’.
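The set expression for G in the example can be evaluated directly once the bodies of evidences are given as concrete sets. In the sketch below the elements `u1` to `u4` are invented purely for illustration; FIG. 2 does not specify the contents of E101, E102, E121 or E133.

```python
# Hypothetical contents for the four bodies of evidences (not taken from FIG. 2).
E101 = frozenset({"u1"})
E102 = frozenset({"u2", "u3"})
E121 = frozenset({"u1", "u2", "u3", "u4"})
E133 = frozenset({"u2", "u4"})

# G = (E101 union E102) intersect E121 intersect E133
G = (E101 | E102) & E121 & E133

# Using only object 1 or only object 2 also yields a valid, possibly smaller, G.
G_only_101 = E101 & E121 & E133
G_only_102 = E102 & E121 & E133
```

With these placeholder sets, both single-object choices are subsets of the G built from objects 1 and 2 together, which is why the combined form gives the larger body of evidences.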

Let κ be an AKB and Lϵℒ.

-   -   At 220, the “support” of a target rule L with respect to the augmented knowledge base (also denoted by σ_(κ)(L)) is equal to the largest G⊆U where $G \overset{d}{\rightarrow}_{\kappa} L$.
-   -   At 230, the “plausibility” of target rule L with respect to the augmented knowledge base (also denoted by σ̄_(κ)(L)) is equal to [σ_(κ)(L′)]′.

At 220, the concept of “support” of L provides a natural way to determine the set of evidences included in and derivable from κ that deductively supports L. At 230, the concept of “plausibility” of L determines the set of evidences included in and derivable from κ that does not deductively support (not L). (See also FIG. 3). At 210, whether “support” of L and/or “plausibility” of L should be taken into account can depend upon whether strict and/or plausible support is sought.

240 and 250 are alternative determinations of the “support” of a target rule L and the “plausibility” of a target rule L. If G is a set or a subset of some universal set U, then G′ will denote the complement of the set G, i.e., G′ contains all the elements in U that are not in G.

Let κ be an AKB and Lϵℒ. One way of determining the support and plausibility is:

-   -   σ_(κ)(L)=σ_(κ₀)(F) where κ₀=κ∪{U→L′} (240), and
-   -   σ̄_(κ)(L)=σ̄_(κ₀)(T) where κ₀=κ∪{U→L} (250).

In other words, to compute the support of L with respect to the AKB κ, add the object U→L′ (where L′ is the negated target rule L) to the AKB κ, and then compute the support of F with respect to the expanded AKB. The plausibility can be computed in a similar manner.

Thus, the above methods (240 and 250) can be used to determine the support and plausibility of any Lϵℒ by extending the original AKB κ. It is a universal method since it is only necessary to know how to determine the support of F or the plausibility of T for any AKB. (See also FIG. 3)
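For a small propositional AKB over a finite U, support and plausibility can be computed by brute force. The sketch below rests on an assumption that is not stated in the text: an element u of U belongs to the support of L exactly when the rules whose bodies of evidences contain u jointly entail L. The function names, atoms and toy AKB are all hypothetical, and this is not one of the unification algorithms referred to in this document.

```python
from itertools import product


def entails(premises, conclusion, atoms):
    """Truth-table check: every world satisfying all premises satisfies conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False
    return True


def support(akb, target, universe, atoms):
    """Elements of U whose active rules jointly entail the target (assumed reading)."""
    result = set()
    for u in universe:
        active = [rule for evidence, rule in akb if u in evidence]
        if entails(active, target, atoms):
            result.add(u)
    return frozenset(result)


def plausibility(akb, target, universe, atoms):
    """Complement of the support of the negated target, i.e. [support(L')]'."""
    negated = lambda world: not target(world)
    return frozenset(universe) - support(akb, negated, universe, atoms)


ATOMS = ["acidic", "velvet", "yellowish"]
U = frozenset({"u1", "u2", "u3"})
AKB = [
    (frozenset({"u1", "u2"}), lambda w: w["acidic"]),
    # acidic and not velvet implies yellowish
    (frozenset({"u2", "u3"}),
     lambda w: not (w["acidic"] and not w["velvet"]) or w["yellowish"]),
]
target = lambda w: w["velvet"] or w["yellowish"]
FALSE = lambda w: False

direct = support(AKB, target, U, ATOMS)
# Method 240: add U -> L' and compute the support of F in the expanded AKB.
via_false = support(AKB + [(U, lambda w: not target(w))], FALSE, U, ATOMS)
plaus = plausibility(AKB, target, U, ATOMS)
```

On this toy AKB the direct computation and the route through the support of F agree, which is the equivalence the text describes at 240.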

Various algorithms for determining σ_(κ)(F), some polynomial and some non-polynomial, are given in Section 15 (Unification Algorithms) below (see FIG. 4, which is a flow diagram of an example unification algorithm, according to an aspect of an embodiment of the invention). In light of the above results, these algorithms can be used to determine the body of evidences that supports any L in an AKB.

In the above definition of AKB, we assume that it is universally accepted that E→A is true. If that is not the case, then E→A is supported by some body of evidences, say G₀. In other words, G₀→(E→A). If the latter is universally accepted, then we can stop there. Otherwise, it in turn is supported by yet another body of evidences, ad infinitum. The latter are clearly higher order AKBs. Since G₀→(E→A) is equivalent to (G₀∩E)→A, in most cases it is only necessary to consider first-order AKBs, the ones defined above (see Section 7 for more detailed discussions of higher order AKBs, as well as the role of higher order AKBs). It is worthwhile to observe that if E=U for all (E→A)ϵκ, then we have a deterministic knowledge base, or a knowledge base without uncertainty.
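The reduction of a higher order object to a first-order one is a single set intersection. A minimal sketch, assuming objects are stored as (evidence set, rule) pairs; the helper name `flatten` and the sample sets are hypothetical.

```python
def flatten(g0, obj):
    """Rewrite G0 -> (E -> A) as the first-order object (G0 intersect E) -> A."""
    evidence, rule = obj
    return (frozenset(g0) & frozenset(evidence), rule)


# Placeholder data: G0 supports the object E -> A.
first_order = flatten({"u1", "u2"}, ({"u2", "u3"}, "A"))
```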

According to an aspect of an embodiment, the way uncertainties and/or incompleteness is dealt with by 220, 230, 240 and/or 250 for deductive reasoning is unique and new, because both the composite rules and the composite sets of evidences are involved, providing a more general, vigorous and stronger reasoning scheme.

Section 2. Evidences, Measures and Extensions

Any collection of sets is partially ordered by ⊆. Therefore the support function σ, or the plausibility function σ̄, induces a partial ordering on ℒ. Using this partial ordering, if L₁, L₂ϵℒ and σ(L₁)⊆σ(L₂), we can say that L₁ has lesser support than L₂. However, if σ(L₁) and σ(L₂) are incomparable, then we do not have any measure of the relative strength of support for L₁ and L₂. In this section, we shall discuss the extension of the partial order into a linear order, or, more importantly, the definition of an absolute measure of strength of support for members of ℒ.

Unlike other knowledge-based systems, evidences play a front and center role in AKBs. However, a body of evidences could be an abstract concept. It is difficult, if not impossible, to completely specify all the evidences. Fortunately, in an embodiment of this invention, it is not essential to specify completely what the body of evidences consists of. One needs only to know the relations among the bodies of evidences, and the strength or validity of each body of evidences. These can in turn be used to determine the strength or validity of Lϵℒ.

We now present various methods, in the form of functions or mappings with domain 2^(U), the collection of all subsets of U, for use with AKBs to measure the strength or validity of the body of evidences associated with a rule.

FIG. 5 provides the types of measures used for representing the strength or validity of the body of evidences, according to an embodiment of the present invention.

Let ℰ⊆2^(U), where 2^(U) denotes the collection of all subsets of U. A measure m over ℰ is a function from ℰ into a linear ordered set, such that for all E₁, E₂ϵℰ, m(E₁)≤m(E₂) whenever E₁⊆E₂. In other words, in an aspect of an embodiment of this invention, measures are monotonic.

Let ℰ⊆2^(U). A probability measure m over ℰ is a function from ℰ∪{Ø, U} into [0,1], the closed unit interval, which satisfies the following conditions:

-   -   1. m(Ø)=0 and m(U)=1.
-   -   2. m(E₀)≥m(E₁)+m(E₂) for all E₀, E₁, E₂ϵℰ where E₁∩E₂=Ø and E₁, E₂⊆E₀.
-   -   3. m(E₁∪E₂)≤m(E₁)+m(E₂) for all E₁, E₂ϵℰ where E₁∪E₂ϵℰ.
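The monotonicity requirement and the three conditions above can be checked mechanically on a finite collection. The sketch below assumes the measure is given as a dictionary from frozensets to numbers; the function names, the tolerance parameter and the sample values are hypothetical.

```python
from itertools import product


def is_monotone(m):
    """Check m(E1) <= m(E2) whenever E1 is a subset of E2."""
    return all(m[a] <= m[b] for a in m for b in m if a <= b)


def is_probability_measure(m, universe, tol=1e-12):
    """Check conditions 1 to 3 for m given on a finite collection of subsets."""
    empty, full_set = frozenset(), frozenset(universe)
    # Condition 1: m(empty) = 0 and m(U) = 1 (taken as given when not listed).
    if m.get(empty, 0.0) != 0.0 or m.get(full_set, 1.0) != 1.0:
        return False
    if any(not 0.0 <= value <= 1.0 for value in m.values()):
        return False
    sets = list(m)
    # Condition 2: disjoint E1, E2 inside E0 cannot outweigh E0.
    for e0, e1, e2 in product(sets, repeat=3):
        if not (e1 & e2) and e1 <= e0 and e2 <= e0:
            if m[e0] + tol < m[e1] + m[e2]:
                return False
    # Condition 3: subadditivity whenever the union is in the collection.
    for e1, e2 in product(sets, repeat=2):
        union = e1 | e2
        if union in m and m[union] > m[e1] + m[e2] + tol:
            return False
    return True


good = {frozenset({"a"}): 0.2, frozenset({"b"}): 0.3, frozenset({"a", "b"}): 0.5}
bad = {frozenset({"a"}): 0.6, frozenset({"b"}): 0.6, frozenset({"a", "b"}): 0.7}
```

The second dictionary is monotone but violates condition 2, which illustrates that a measure in the sense above need not be a probability measure.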

In the above definition, as well as the rest of this document, Ø denotes the ‘empty set’ or the set that contains no elements.

Clearly, every probability measure over ℰ is a measure over ℰ. Moreover, if ℰ=2^(U), then the definition of probability measure given above is equivalent to the usual definition of probability measure.

Measures are also used in virtually all of the existing knowledge-based systems that deal with uncertainty. However, the measures considered in AKBs map a body of evidences into a member of a linear ordered set or into a real number. On the other hand, measures in existing knowledge-based systems map a rule directly into a real number.

Let κ be an AKB. m is a (probability) κ-measure if and only if m is a (probability) measure over ℰ_(κ), the collection of all sets (bodies of evidences) given in an AKB.

Let κ be an AKB. The extension of ℰ_(κ), denoted by ℰ̃_(κ), is the smallest collection of subsets of U that contains all the sets in ℰ_(κ) and is closed under complement, union, and intersection. In other words, ℰ̃_(κ) includes sets which can be formed from ℰ_(κ) in a natural manner.
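For a finite U, the extension of a collection of sets can be computed by repeatedly applying complement, union and intersection until nothing new appears. This is a naive fixed-point sketch for illustration only; the function name `extension` and the sample collections are hypothetical.

```python
def extension(collection, universe):
    """Smallest family containing the collection and closed under complement,
    union and intersection (naive fixed-point iteration over a finite universe)."""
    U = frozenset(universe)
    closed = {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        current = list(closed)
        for a in current:
            candidates = [U - a]
            for b in current:
                candidates.append(a | b)
                candidates.append(a & b)
            for c in candidates:
                if c not in closed:
                    closed.add(c)
                    changed = True
    return closed


ext_two = extension([{1}, {1, 2}], {1, 2, 3})   # all 8 subsets of {1, 2, 3}
ext_one = extension([{1, 2}], {1, 2, 3})        # {1, 2}, {3}, the empty set, U
```

The loop terminates because there are only finitely many subsets of U; it is exponential in the worst case and is not meant as an efficient method.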

Moreover, let m be a (probability) κ-measure. An extension m̃ of m is a (probability) measure over ℰ̃_(κ) such that m̃(E)=m(E) for all Eϵℰ_(κ).

In the case where m̃ is probabilistic, the value m̃(σ_(κ)(L)), for Lϵℒ, can be interpreted as the probability that L is true.

By viewing conditioning/composition rules as extension schemes, it is clear that the ability to extend m to m̃ is equivalent to the ability to infer or to reason in existing knowledge-based systems. In an AKB, inference or reasoning can be done in two separate phases:

-   -   1. Form the body of evidences that supports the target rules.
-   -   2. Determine the value of m̃ for the body of evidences obtained in the first phase.

There are many schemes for extending a κ-measure m with domain ℰ_(κ) to m̃ with domain ℰ̃_(κ). Many try to capture the essence of the following, which holds for probability measures.

Let m be a probability measure over 2^(U) and G, G₁, G₂⊆U. Then m satisfies the following conditions:

-   -   1. m(G′)=1-m(G).
-   -   2. max(0, m(G₁)+m(G₂)-1)≤m(G₁∩G₂)≤min(m(G₁), m(G₂)).
-   -   3. max(m(G₁), m(G₂))≤m(G₁∪G₂)≤min(1, m(G₁)+m(G₂)).

The above observation can also be found in Eugene S. Santos and Eugene Santos, Jr. Reasoning with uncertainty in a knowledge-based system. In Proceedings of the seventeenth international symposium on multi-value logic, Boston, Mass., pages 75-81, 1987, where it was used to compute the probabilities of the union and intersection (the results are the same but are used differently). In this invention, it is used to determine a lower bound and an upper bound for m(E), for every Eϵℰ̃, provided m is a probability measure. Unfortunately, after applying the results a few times, the lower and upper bounds quickly approach 0 and 1, respectively, yielding minimal useful information.
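The loss of information from repeated bounding can be seen with a small interval calculation. The sketch below propagates [lower, upper] intervals through the intersection and union bounds stated above; the function names and the starting value 0.7 are hypothetical choices for illustration.

```python
def and_bounds(a, b):
    """Interval for m(G1 intersect G2) from intervals for m(G1) and m(G2)."""
    (lo1, hi1), (lo2, hi2) = a, b
    return (max(0.0, lo1 + lo2 - 1.0), min(hi1, hi2))


def or_bounds(a, b):
    """Interval for m(G1 union G2) from intervals for m(G1) and m(G2)."""
    (lo1, hi1), (lo2, hi2) = a, b
    return (max(lo1, lo2), min(1.0, hi1 + hi2))


known = (0.7, 0.7)      # a set whose probability is known exactly
acc = known
history = [acc]
for _ in range(3):      # intersect with three more sets of probability 0.7
    acc = and_bounds(acc, known)
    history.append(acc)

union_interval = or_bounds(known, known)
```

After three intersections the lower bound has dropped to 0 while the upper bound is still 0.7, and a single union already has upper bound 1, so the intervals soon say very little.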

Schemes for extending a measure m over ℰ to a measure m̃ over ℰ̃ are provided.

-   -   1. m̃(E′)=1-m(E), m̃(E₁∩E₂)=min(m(E₁), m(E₂)), and m̃(E₁∪E₂)=max(m(E₁), m(E₂)).
-   -   2. m̃(E′)=1-m(E), m̃(E₁∩E₂)=max(0, m(E₁)+m(E₂)-1), and m̃(E₁∪E₂)=min(1, m(E₁)+m(E₂)).
-   -   3. m̃(E′)=1-m(E), m̃(E₁∩E₂)=m(E₁)m(E₂), and m̃(E₁∪E₂)=1-(1-m(E₁))(1-m(E₂)).

Schemes 1, 2 and 3 above all satisfy De Morgan's laws, and can be applied to compute m̃(E) for any Eϵℰ̃_(κ). However, one could run into various problems by using these schemes indiscriminately, as witnessed by many existing knowledge-based systems that use expressions similar to some of those given above to manipulate their numbers. Worse yet, for all three schemes above, the values of m̃(E), where Eϵℰ̃_(κ), may not be unique, since they may depend on how E is expressed. Moreover, Schemes 1 and 2 are not probabilistic (they are not additive). On the other hand, Scheme 3 implicitly assumes that all the sets in ℰ are stochastically independent. Due to this implicit assumption, Scheme 3 has not been strictly applied in real applications.
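The dependence on how a set is expressed can be seen with a single set E of value 0.5. The sketch below writes the three schemes as plain functions (the function names are hypothetical) and evaluates expressions that denote the same set in different ways.

```python
def complement(a):
    return 1.0 - a

# Scheme 1: min / max
def and_minmax(a, b): return min(a, b)
def or_minmax(a, b): return max(a, b)

# Scheme 2: bounded difference / bounded sum
def and_bounded(a, b): return max(0.0, a + b - 1.0)
def or_bounded(a, b): return min(1.0, a + b)

# Scheme 3: product / probabilistic sum (implicit independence)
def and_product(a, b): return a * b
def or_product(a, b): return 1.0 - (1.0 - a) * (1.0 - b)

m_E = 0.5
# E intersect E' is the empty set, whose value should be 0; Scheme 1 gives 0.5.
empty_via_scheme1 = and_minmax(m_E, complement(m_E))
# E intersect E is E itself, whose value is 0.5; Schemes 2 and 3 disagree.
same_via_scheme2 = and_bounded(m_E, m_E)
same_via_scheme3 = and_product(m_E, m_E)
```

Each scheme therefore assigns a value that differs from the one obtained by first simplifying the set expression, which is the non-uniqueness noted above.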

Of course, many other schemes exist in the literature. Many are ad hoc, such as the composition rule for certainty factors used in the original MYCIN (Bruce G. Buchanan and Edward H. Shortliffe. Rule-Based Expert Systems. Addison Wesley, 1984). There are also other schemes that are essentially modifications and/or combinations of the above schemes, such as the composition rule (orthogonal sum) used by the Dempster-Shafer approach (G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976). Finally, there is a natural scheme for extending a probabilistic m into a probabilistic {tilde over (m)}, namely, the constraint stochastic independence method introduced in the latter part of this invention, which can be used to form a new reasoning scheme, the augmented reasoning. This scheme is particularly suited for AKBs.

It is not necessary to select just one extension scheme for m for a given AKB. It is possible, and in fact beneficial at times, to use more than one scheme, either simultaneously and/or conditionally. In the latter case, several schemes may be provided and, based on L and/or how L is formed, a particular scheme is then selected to evaluate the validity of L. A version of multiple schemes used conditionally is given in (L. A. Zadeh. A simple view of the Dempster-Shafer theory of evidence and its implication for the rule of combination. AI Magazines, 7:85-90, 1986) in connection with fuzzy knowledge bases. However, in this case, as in virtually all existing cases, the extension is applied directly to the rules rather than the bodies of evidences.

Clearly, a complete formulation of an AKB requires the specifications of κ, m, and the scheme for extending m to {tilde over (m)}. The extension scheme is just one of the facets in an AKB, and any scheme could be used for that purpose. By selecting appropriately the E, the A, and the E→A, as well as m and the extension scheme for m, most of the well-known approaches to knowledge bases and uncertain reasoning, including probabilistic logic (N. J. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71-87, 1986), Dempster-Shafer theory (G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976), Bayesian network (Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988), Bayesian knowledge base (Eugene Santos, Jr. and Eugene S. Santos. A framework for building knowledge-bases under uncertainty. Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999), fuzzy logic (L. A. Zadeh. The role of fuzzy logic in the management of uncertainty in expert systems. Fuzzy Sets and Systems, 11:199-227, 1983; L. A. Zadeh. Fuzzy sets. Information and Control, 8:338-353, 1965; R. Yager. Using approximate reasoning to represent default knowledge. Artificial Intelligence, 31:99-112, 1987), numerical ATMS (Johan DeKleer. An assumption-based TMS. Artificial Intelligence, 28:163-196, 1986), incidence calculus (Alan Bundy. Incidence calculus: A mechanism for probabilistic reasoning. Journal of Automated Reasoning, 1(3):263-283, 1985; Weiru Liu and Alan Bundy. Constructing probabilistic ATMSs using extended incidence calculus. International Journal of Approximate Reasoning, 15(2):145-182, 1996), etc., can be formulated as special cases of AKBs (see Section 3).

Section 3. Comparative Studies

The definition of AKB given in Section 1 encompasses virtually all of the widely used formalisms of uncertain reasoning and knowledge bases. As we shall see below, these existing systems can be formulated as AKBs according to the embodiments of the present invention by imposing certain restrictions on the relations among the E, the A, and the (E→A) in κ, as well as on how the κ-measure m is extended.

For illustration purposes, we shall consider the following well-known formalisms: probabilistic logic (N. J. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71-87, 1986), Bayesian network (Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988) and Bayesian knowledge base (Eugene Santos, Jr. and Eugene S. Santos. A framework for building knowledge-bases under uncertainty. Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999), and Dempster-Shafer theory (G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976).

A probabilistic logic can be defined by an ordered pair (

, p), where

is a finite collection of elements in

, a first order logic, and p is a function from

into [0,1]. For simplicity, instead of a first order logic, assume that

is a propositional calculus. For each Pϵ

, associate the set E_(p), and let κ be the AKB including all E_(p)→P and E_(p)′→P′, where Pϵ

. (There is a one-to-one correspondence between P and E_(p), where E_(p) is the set version of the logical expression P.) Moreover, define m such that m(E_(p))=p(P) for all Pϵ

. Extend m into {tilde over (m)} that satisfies the 3 properties of probabilistic measure given in Section 2. It can be shown that the resulting AKB κ is equivalent to (

, p).

Both Bayesian networks and Bayesian knowledge bases can be defined byspecifying a finite collection

of conditional probabilities subject to certain restrictions. Let Rdenote the collection of all random variables used in the Bayesiannetwork or Bayesian knowledge base, and for each Xϵ

, let I_(X) represents the set of all possible instantiations of X. Atypical element A of

is of the form: p(X=α|X₁=α₁, X₂=α₂, . . . , X_(κ)=α_(κ)), where Xϵ

, αϵI_(X), and for all 1≤i≤k, X_(i)ϵ

, and α_(i)ϵI_(X) _(i) .

Associate with each element Aϵ

: the rule

-   -   (X₁=α₁)∧(X₂=α₂)∧ . . . ∧(X_(κ)=α_(κ))⇒(X=α),
        and the set E_(A).

Let κ be the AKB containing one or more (e.g., all) objects of the form E_(A)→A where Aϵ

. Moreover, define m such that:

-   -   m(E_(A))=p(X=α|X₁=α₁, X₂=α₂, . . . , X_(κ)=α_(κ)).

Let A=((X₁=α₁)∧(X₂=α₂)∧ . . . ∧(X_(κ)=α_(κ))⇒(X=α)) ϵ

_(κ), and B=((Y₁=b₁)∧(Y₂=b₂)∧ . . . ∧(Y_(l)=b_(l))⇒(Y=b)) ϵ

_(κ). A˜B means that, for all 1≤i≤k and 1≤j≤l, α_(i)=b_(j) whenever X_(i)=Y_(j). In other words, A and B are antecedent-compatible. In order to capture the properties of conditional probabilities, the following condition is needed: For every A, B ϵ

_(κ), if A˜B, X=Y but α≠b, then E_(A)∩E_(B)=Ø.

However, for Bayesian knowledge bases, an additional property is needed: For every A, B ϵ

_(κ), if A˜B, X=Y and α=b, then E_(A)=E_(B). (This is the exclusivity property given in Eugene Santos, Jr. and Eugene S. Santos. A framework for building knowledge-bases under uncertainty. Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999.)
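As a non-limiting sketch, the antecedent-compatibility test and the resulting requirement on the associated evidence sets might be checked as follows. The rule encoding (a dictionary of antecedent assignments paired with a head assignment) and the function names are assumptions of this sketch, and each variable is assumed to occur at most once in an antecedent.

```python
def antecedent_compatible(ante_a, ante_b):
    """A ~ B: every variable shared by both antecedents has the same value."""
    return all(ante_b[var] == val for var, val in ante_a.items() if var in ante_b)


def required_relation(rule_a, rule_b):
    """Return the relation that E_A and E_B must satisfy, or None.

    A rule is (antecedent, (head_variable, head_value)), where antecedent
    is a dict mapping variable names to values.
    """
    (ante_a, (x, a)), (ante_b, (y, b)) = rule_a, rule_b
    if x != y or not antecedent_compatible(ante_a, ante_b):
        return None
    # Same head variable: different values force disjoint evidence sets;
    # equal values force equal evidence sets (Bayesian knowledge bases only).
    return "disjoint" if a != b else "equal"


# Hypothetical rules over variables Rain, Season and Wet.
r1 = ({"Rain": "yes"}, ("Wet", "yes"))
r2 = ({"Rain": "yes", "Season": "fall"}, ("Wet", "no"))
r3 = ({"Rain": "no"}, ("Wet", "no"))

print(required_relation(r1, r2))
print(required_relation(r1, r3))
```

Here r1 and r2 are antecedent-compatible with conflicting heads, so their evidence sets would have to be disjoint, while r1 and r3 disagree on Rain and impose no requirement.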

On the surface, it might seem that Bayesian networks and Bayesianknowledge bases can deal with relationships or constraints among thesets. Actually, they cannot handle constraints unless they are inherentin probability theory or in their definition. In other words, neithercan handle externally imposed constraints.

Bayesian networks and Bayesian knowledge bases essentially employed Scheme 2 given in Section 2 with the proviso that {tilde over (m)}(A∩B)=0 whenever A∩B=Ø. It can be shown that the resulting AKB κ, as defined above, is equivalent to the corresponding Bayesian network or Bayesian knowledge base.

Another well-known reasoning scheme is based on Dempster-Shafer theory (DST) (G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976). Using DST terminology, consider a simple support function S with focal element A and degree of support s. S can be represented by the objects E→A and E′→A′, with m(E)=s and m(E′)=1-s. If relations exist among the focal elements, then additional objects have to be included in the AKB. For example, suppose A and B are disjoint focal elements; then we need to include the object U→A′∨B′ in the AKB. (Although focal elements in DST are sets, we view them as propositions to fit them into our formulation.) If the frame of discernment Θ is given, another way of representing the support functions, without resorting to using additional objects to specify the relations among focal elements, is to write: A={θ₁, θ₂, . . . , θ_(κ)}⊆Θ as θ₁∨θ₂∨ . . . ∨θ_(κ).

In this case, we need only add, once and for all, the objects U→a′∨b′ for distinct a, bϵΘ. DST employed Scheme 2 given in Section 2 to extend the κ-measure m and then normalized the result by dividing it by 1-{tilde over (m)}(σ_(κ)(F)). It can be shown that the orthogonal sum used in DST can be derived from the above representation. Although DST emphasized evidence in its formulation, like all other existing models, it did not distinguish between the evidence and the proposition it supports.

Section 4. Augmented Knowledge Base Inference Engine

In this section, we shall show how inference or reasoning is done in an AKB. From the above discussions, it is clear that, in an embodiment of this invention, the functions of the augmented knowledge base engine (AKBE) can be divided neatly into two phases. The first phase deals only with the AKB κ (or any desired subset thereof, in the case of distributed knowledge bases), while the second phase deals with the κ-measure and its extension. As noted earlier, it is possible to use more than one extension scheme in a single AKB. Moreover, depending on the extension scheme(s) used, it may not be necessary to complete the first phase before starting the second phase. In particular, the two phases, as well as the extension schemes, if two or more schemes are used simultaneously, may be carried out in parallel to make the process more efficient.

The main thrust of the first phase of an AKBE is the evaluation of σ_(κ)(L). By virtue of results given in Section 1, we need only provide a method for determining σ_(κ)(F) for an arbitrary AKB κ. Many such algorithms are given in Section 15 (Unification Algorithms).

The main thrust of the second phase of an AKBE is the computation of {tilde over (m)}(σ_(κ)(F)). This clearly depends on the scheme used for extending m. As stated earlier, the result in Section 2 can be used to provide a lower bound and an upper bound for {tilde over (m)}(E) for any probabilistic extension of m as long as m itself is probabilistic. Unfortunately, after applying the theorem a few times, the lower and upper bounds quickly approach 0 and 1, respectively, yielding very little useful information. Nevertheless, both the lower and upper bounds can be computed easily, and thus they provide a very efficient way of bounding the value of {tilde over (m)}(E).

Several extension schemes are given in Section 2. Clearly, all of those schemes are intended to be used to compute {tilde over (m)}(E) directly, provided Eϵ{tilde over (∈)}_(κ), without any regard to the relations among the sets in {tilde over (∈)}_(κ) (e.g., is E₁ a subset of E₂, or are they disjoint?). There are many other known extension schemes and/or conditioning/composition rules. However, they are either too restrictive, or have no clear semantics and thus suffer from various anomalies.

A promising approach is the use of the extension scheme based on the constraint stochastic independence method mentioned earlier and discussed in detail below. It provides a unique point value for {tilde over (m)}(E), and forms the basis for a new computational model of uncertain reasoning. A unique feature of this new reasoning scheme is its ability to deal with relationships among the bodies of evidences, or its capability to take into account constraints imposed on the bodies of evidences. This is a clear departure from all existing formalisms for uncertain reasoning, which are incapable of dealing with such constraints. This added feature can lead to more powerful, robust, accurate and/or refined characterizations of the knowledge base. In addition, because of its clear probabilistic semantics, virtually all of the known anomalies associated with existing knowledge-based systems disappear.

Although our discussion on AKBE focuses on σ_(κ), the same method can be used for σ _(κ) or other appropriate functions. In addition, if κ is consistent (see Section 5), and both σ_(κ) and σ _(κ) are considered, then one can determine an interval value ({tilde over (m)}(σ_(κ)(L)), {tilde over (m)}(σ _(κ)(L))) for any Lϵ

.

In general, AKBs allow rules to be supported by bodies of evidences that are related to other bodies of evidences. For example, two different rules may be supported by the same body of evidences, or a rule is supported by E while another rule is supported by E′. If one needs to know which specific rules, not just which specific bodies of evidences, are applied to obtain the desired results, such as in a PROLOG-like query, Σ_(κ)(L)={ωϵ{circumflex over (κ)}|τ(ω)⇒L} can first be determined. Clearly, all the rules involved in the inference are included in Σ_(κ)(L). One can then use the definition, together with the given constraints, to determine σ_(κ)(L).

Providing explanations on how results were obtained by any knowledge-based system will go a long way in bolstering the user's confidence in the system. For AKBs, the explanation is built-in. All the rules that are involved in deducing the final results are given in Σ_(κ)(L). Moreover, the overall body of evidences supporting L is given in σ_(κ)(L). Furthermore, all constraints involved, when the constraint stochastic independence method is employed, can be readily extracted during the second inference phase.

It is worth noting that the Σ_(κ)(L) and/or σ_(κ)(L) obtained in the first phase of the AKBE contain a mountain of treasures (information) waiting to be exploited. We have seen how they can be used to determine the support of L, as well as how to explain away L. Many other concepts, such as consistency, reliability, vulnerability, etc., which are central to many different complex systems, as well as deception, which is central to knowledge-based systems, can also be formulated and determined via Σ_(κ)(L), σ_(κ)(L) and/or {tilde over (m)} (see Section 5 below).

Section 5. Properties of AKB

FIG. 7 is a table of methods for testing and ensuring properties of AKB,according to an embodiment of the present invention.

In this section, certain basic properties of AKBs are presented together with methods for testing and ensuring that the AKB possesses the properties.

Let κ be an AKB. κ is consistent if and only if G=Ø, subject to C_(κ), whenever $G \overset{d}{\rightarrow}_{\kappa} F$. C_(κ) is the collection of all constraints involving κ. This means that for consistent κ, if a rule is FALSE, then it is not supported by any evidences.

Consistency is an important issue involving an AKB. It specifies the internal consistencies of the contents of the AKB. For example, if both E₁→A and E₂→A′ are in κ, then E₁ and E₂ should be disjoint for κ to be consistent. We need a higher order AKB to deal with inconsistencies (see Section 7). If an AKB is not consistent, then the (E→A) in κ may not be universally accepted. Inconsistency is closely related to reliability. It could also be a precursor of something important, such as deception, when dealing with knowledge bases not known for their inconsistencies.

AKBs associated with probabilistic logic (N. J. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71-87, 1986), Bayesian networks (Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988) and Bayesian knowledge-bases (Eugene Santos, Jr. and Eugene S. Santos. A framework for building knowledge-bases under uncertainty. Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999) are consistent. On the other hand, consistency is not strictly observed in Dempster-Shafer theory (G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976). This omission, together with its composition rule, can lead to certain anomalies (L. A. Zadeh. On the validity of Dempster's rule of combination of evidence. Technical Report 79/24, University of California, Berkeley, 1979; L. A. Zadeh. A mathematical theory of evidence (book review). AI Magazines, 55:81-83, 1984; L. A. Zadeh. A simple view of the Dempster-Shafer theory of evidence and its implication for the rule of combination. AI Magazines, 7:85-90, 1986).

In general, consistency imposes certain conditions that the E's should satisfy. Some criteria for consistency are given below:

Let κ be an AKB. The following statements are equivalent:

-   -   1. κ is consistent.
-   -   2. G=Ø, subject to C_(κ), whenever $G \overset{d}{\rightarrow}_{\kappa} F$.
-   -   3. For every ωϵ{circumflex over (κ)}, l(ω)=Ø whenever τ(ω)=F.
-   -   4. σ_(κ)(F)=Ø.
-   -   5. σ _(κ)(T)=U.
-   -   6. For every Lϵ
        , σ_(κ)(L)∩σ_(κ)(L′)=Ø.
-   -   7. For every Lϵ
        , σ _(κ)(L)∪σ _(κ)(L′)=U.
-   -   8. For every Lϵ
        , σ_(κ)(L)⊆σ _(κ)(L).

Any of the criteria above can be used to test and ensure that the AKB is consistent. As observed earlier, an inconsistent AKB is actually a higher order AKB. Although inconsistencies can be removed from an AKB κ by adding the constraint σ_(κ)(F)=Ø, this method may not be advisable since it could alter the nature of the AKB. Besides, all inconsistencies should be handled with care. They may occur due to errors. More importantly, they may occur due to possible deceptions (see below), especially if the AKB is not known to have inconsistencies.
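As a non-limiting sketch, the criterion σ_(κ)(F)=Ø can be tested by brute force on a small AKB. The sketch assumes a finite U held as a Python frozenset, rules represented by their sets of satisfying truth assignments, and a support computed as the union, over every subset of κ whose conjoined rules imply L, of the intersection of the corresponding evidence sets. The names `support` and `is_consistent` are hypothetical, and the exhaustive enumeration merely stands in for the Unification Algorithms of Section 15.

```python
from itertools import combinations


def support(kb, L, U, T):
    """Brute-force support of L.

    kb is a list of (evidence, rule) pairs; evidence is a frozenset of
    points of U, and a rule (like L and T) is a frozenset of the truth
    assignments that satisfy it, T being the set of all assignments.
    """
    result = frozenset()
    for size in range(len(kb) + 1):
        for tau in combinations(kb, size):
            evidence, rule = U, T
            for E, A in tau:
                evidence &= E   # evidence of a conjunction: intersection
                rule &= A       # models of a conjunction: intersection
            if rule <= L:       # the conjoined rules imply L
                result |= evidence
    return result


def is_consistent(kb, U, T):
    """Criterion 4: the support of FALSE (no satisfying assignment) is empty."""
    return not support(kb, frozenset(), U, T)


U = frozenset(range(4))
T = frozenset({True, False})      # truth values of a single atom A
A = frozenset({True})
NOT_A = T - A

kb_ok = [(frozenset({0, 1}), A), (frozenset({2, 3}), NOT_A)]
kb_bad = [(frozenset({0, 1}), A), (frozenset({1, 2}), NOT_A)]

print(is_consistent(kb_ok, U, T), is_consistent(kb_bad, U, T))
print(sorted(support(kb_bad, frozenset(), U, T)))
```

The second AKB supports A and A′ with overlapping evidence, so the overlap is reported as the evidence supporting FALSE.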

Let κ be an AKB. κ is complete if and only if for every Lϵ

, σ _(κ)(L)⊆σ_(κ)(L). κ is perfect if and only if for every Lϵ

, σ _(κ)(L)=σ_(κ)(L).

Perfect AKBs are those AKBs whose support and plausibility for any Lϵ

are equal. This turns out to be a very powerful property.

Additional methods for testing and ensuring that an AKB is perfect are given below:

Let κ be an AKB. The following statements are equivalent:

-   -   1. κ is perfect.
-   -   2. For every Lϵ
        , σ _(κ)(L)=σ_(κ)(L).
-   -   3. κ is both consistent and complete.
-   -   4. κ satisfies the following conditions:
        -   a) σ_(κ)(L′)=[σ_(κ)(L)]′;
        -   b) σ_(κ)(L₁∧L₂)=σ_(κ)(L₁)∩σ_(κ)(L₂); and
        -   c) σ_(κ)(L₁∨L₂)=σ_(κ)(L₁)∪σ_(κ)(L₂).

In light of statement 4 of the above result, if κ is perfect, then for every Lϵ

σ_(κ)(L) is completely determined by σ_(κ)(A), where Aϵ

_(a), and

_(a) is the collection of all atomic proposition in

. An atomic proposition is a proposition that is indivisible, i.e., it cannot be expressed in terms of other propositions. A rule A is a combination of or formed of one or more atomic propositions.

We shall now present some other important properties involving AKBs, andshow their possible applications.

Let κ be an AKB, m a κ-measure, {tilde over (m)} an extension of m over{tilde over (∈)}_(κ), λ⊆κ, and Lϵ

. ψ_(κ)(λ, L)={tilde over (m)}(σ_(κ)(L))-{tilde over (m)}(σ_(κ-λ)(L)).

ψ_(κ)(λ, L) will be referred to as the contribution of λ over L. If μϵκ, then we shall also write: ψ_(κ)(μ, L) for ψ_(κ)({μ}, L).

The concept of contribution can be used to detect and overcome vulnerabilities, as well as to detect possible deceptions (see below).

Let κ be an AKB, μϵκ and Lϵ

. μ is an essential support for L if and only if ψ_(κ)(μ, L)>0.

Clearly, all the vulnerabilities of κ with respect to L can be determined from S_(κ)(L)={μϵκ|μ is an essential support for L}, which can in turn be obtained from Σ_(κ)(L). As a matter of fact, we can order the elements μ of S_(κ)(L) in descending order of ψ_(κ)(μ, L). In this way, one can pinpoint all the major vulnerabilities of κ with respect to L. Sometimes, vulnerabilities are not due to a single element of κ. In this case, Σ_(κ)(L) should be examined to determine groups of elements of κ that may be the root cause of major vulnerabilities.
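As a non-limiting sketch, contributions and the ordering of essential supports might be computed as follows for a small AKB. The sketch assumes a finite U, rules represented by their sets of satisfying truth assignments, a brute-force support (the union, over every subset of κ whose conjoined rules imply L, of the intersection of the evidence sets), and a normalized counting measure standing in for {tilde over (m)}. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def support(kb, L, U, T):
    """Brute-force support of L (see the assumptions in the lead-in)."""
    result = frozenset()
    for size in range(len(kb) + 1):
        for tau in combinations(kb, size):
            evidence, rule = U, T
            for E, A in tau:
                evidence &= E
                rule &= A
            if rule <= L:
                result |= evidence
    return result


def measure(G, U):
    """Stand-in extended measure: the fraction of U covered by G."""
    return Fraction(len(G), len(U))


def contribution(kb, removed, L, U, T):
    """psi(removed, L): loss of support for L when `removed` is deleted."""
    rest = [obj for obj in kb if obj not in removed]
    return measure(support(kb, L, U, T), U) - measure(support(rest, L, U, T), U)


def essential_supports(kb, L, U, T):
    """(contribution, index) of each essential support, largest first."""
    scored = [(contribution(kb, [mu], L, U, T), i) for i, mu in enumerate(kb)]
    return sorted((s for s in scored if s[0] > 0), key=lambda pair: -pair[0])


U = frozenset(range(10))
T = frozenset(product([False, True], repeat=2))          # assignments (A, B)
A = frozenset(w for w in T if w[0])
B = frozenset(w for w in T if w[1])
A_IMPLIES_B = frozenset(w for w in T if (not w[0]) or w[1])

kb = [
    (frozenset(range(6)), A),
    (frozenset(range(4)), A_IMPLIES_B),
    (frozenset({3, 4}), B),
]

for psi, index in essential_supports(kb, B, U, T):
    print(index, psi)
```

In this example the first two objects, which jointly derive B, each carry a larger contribution than the object that asserts B directly, so they would be examined first when looking for vulnerabilities.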

Detecting possible deceptions involving L can become more complicated since it involves humans and/or another corrupted machine. Nevertheless, the detection process can be started by first determining the major single and/or multiple vulnerabilities with respect to L, as explained above, and carefully examining which of the major single and/or multiple vulnerabilities are plugged using questionable sources. In general, deceptions might involve altering, adding and/or deleting multiple elements of κ. For security purposes, no existing element of κ should be allowed to be removed without thorough investigation to assure that the element in question is useless, not appropriate and/or superseded. All changes should be introduced as new members of κ (deletion of (E→L)ϵκ can be accomplished by temporarily adding the new member ℏ→L to κ). A log should be provided for any changes, addition or physical deletion of any member of κ.

Section 6. Inductive Inference

FIG. 8 is a flow diagram of inductive inference in AKB, according to an embodiment of the present invention.

Inductive inference comes in many different flavors and has been widely studied in the literature (Aidan Feeney and Evan Heit, editors. Inductive Reasoning: Experimental, Developmental, and Computational Approaches. Cambridge University Press, 2007; John H. Holland, Keith J. Holyoak, Richard E. Nisbett, and Paul R. Thagard. Induction: Process of Inference, Learning, and Discovery. MIT Press, 1989). In this section, we provide a method for performing inductive inference in AKBs. The inductive inference presented in this document is not only more general (capable of handling uncertainties and/or incompleteness), but also more robust, than traditional inductive inference. Together with the results given above, AKB can serve as a unified framework for deductive and inductive reasoning. Furthermore, in view of the results that will be presented in this section, we can view deductive reasoning and inductive reasoning as duals of each other.

Let κ be an AKB, G⊆U and Lϵ

.

$G \overset{i}{\rightarrow}_{\kappa} L$ (where i denotes inductive reasoning) if and only if there exists L₀ϵ

where L₀ refers to a composite rule such that the target rule L⇒L₀ and G→_(κ)L₀.

This is equivalent to there exists ωϵ{circumflex over (κ)} such that l(ω)=G and L⇒r(ω). Observe the difference between

$G \overset{i}{\rightarrow}_{\kappa} L$ and $G \overset{d}{\rightarrow}_{\kappa} L$.

In other words, $G \overset{i}{\rightarrow}_{\kappa} L$ means there exists a composite object ω in the AKB κ (ω is formed from objects in κ), where G is the set of evidences associated with ω, and the rule associated with ω is implied by L.

$G \overset{i}{\rightarrow}_{\kappa} L$ is one of the key concepts for dealing with inductive reasoning and other aspects of AKBs. Among other things, it provides a natural way to show how a given Lϵ

is inductively related to κ. Let κ be an AKB and Lϵ

.

-   -   1. The inductive plausibility for L with respect to κ (also denoted by ϕ_(κ)(L)) is the smallest G⊆U where $G \overset{i}{\rightarrow}_{\kappa} L$.
-   -   2. The inductive support for L with respect to κ (also denoted by ϕ _(κ)(L)) is equal to [ϕ_(κ)(L′)]′.

In other words, inductive plausibility for L with respect to κ is the smallest set of evidences G such that $G \overset{i}{\rightarrow}_{\kappa} L$, while inductive support for L with respect to κ is the largest set of evidences G such that $G \overset{i}{\rightarrow}_{\kappa} L^{\prime}$.

Let κ be an AKB. Let ωϵ{circumflex over (κ)}. The complement ω′ of ω isdefined recursively as follows:

-   -   1. If ω=(E→A) where ωϵ{circumflex over (κ)}, then ω′=(E′→A′).
-   -   2. If ω₁, ω₂ϵ{circumflex over (κ)}, then (ω₁∨ω₂)′=ω₁′∧ω₂′ and (ω₁∧ω₂)′=ω₁′∨ω₂′.

Clearly, for any composite object ω, ω′ is obtained from ω by taking the complement of the set of evidences portion of ω, and the negation of the rule portion of ω. The complement ω′ of ω is needed to show the relations between deductive inference and inductive inference.

Let Ω⊆{circumflex over (κ)}. Ω̄={ω′|ωϵΩ}.

Let τ⊆κ. τ̄={ω′|ωϵτ}. In particular, κ̄ is the complement of κ. When κ̄ is viewed as an AKB, then C_(κ̄)=C_(κ).

The following properties show that inductive support and inductive plausibility with respect to κ are closely related to plausibility and support with respect to the complement κ̄ of κ. They also provide a method for computing the inductive support and inductive plausibility with respect to κ using the Unification Algorithms (see Section 15) for computing the support and plausibility with respect to κ̄.

Let κ be an AKB and Lϵ

. ϕ_(κ)(L)=σ̄_(κ̄)(L) and ϕ̄_(κ)(L)=σ_(κ̄)(L), where κ̄ is the complement of κ, σ̄ denotes plausibility and ϕ̄ denotes inductive support.

Clearly, reasoning using σ_(κ) corresponds to deductive inference or reasoning. On the other hand, reasoning using ϕ_(κ) may be viewed as inductive inference or reasoning. ϕ_(κ) can be used for extracting new knowledge from κ. In view of the above results, the complement κ̄ may be viewed as the AKB induced by κ using inductive inference. However, it should be observed that, even if κ is consistent, new knowledge extracted from κ need not be consistent with one another; i.e., κ̄ need not be consistent. In the rest of this section, we shall provide necessary and sufficient conditions for the new knowledge extracted from κ to be consistent.

In view of the previous paragraphs, inductive inference can be accomplished using deductive inference, as follows: construct the complement κ̄ (410) from κ (400), construct the negation L′ (412) from L (405), perform deductive inference on κ̄ and L′ (420) to determine the support of L′ with respect to κ̄, and other items in (422), and then use the relation ϕ_(κ)(L)=σ̄_(κ̄)(L) given above, i.e., ϕ_(κ)(L)=[σ_(κ̄)(L′)]′, to compute the inductive plausibility of L with respect to κ, as well as other items (424).
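As a non-limiting sketch, the steps above can be mirrored on a small AKB. The sketch assumes a finite U, rules represented by their sets of satisfying truth assignments, a brute-force deductive support, and a reading of the relation above in which the inductive plausibility of L with respect to κ is the complement of the support of L′ with respect to the complement of κ. The function names are hypothetical.

```python
from itertools import combinations, product


def support(kb, L, U, T):
    """Brute-force deductive support of L (assumed definition, see lead-in)."""
    result = frozenset()
    for size in range(len(kb) + 1):
        for tau in combinations(kb, size):
            evidence, rule = U, T
            for E, A in tau:
                evidence &= E
                rule &= A
            if rule <= L:
                result |= evidence
    return result


def complement_kb(kb, U, T):
    """Replace each object E -> A by its complement E' -> A'."""
    return [(U - E, T - A) for E, A in kb]


def inductive_plausibility(kb, L, U, T):
    """Complement of the support of L' with respect to the complement AKB."""
    kb_bar = complement_kb(kb, U, T)     # step 410
    not_L = T - L                        # step 412
    return U - support(kb_bar, not_L, U, T)   # steps 420 to 424


U = frozenset(range(6))
T = frozenset(product([False, True], repeat=2))   # assignments (A, B)
A = frozenset(w for w in T if w[0])
B = frozenset(w for w in T if w[1])

kb = [(frozenset({0, 1, 2}), A), (frozenset({2, 3}), B)]

# A and B are each implied by the target rule A ∧ B, so the inductive
# plausibility is confined to evidence shared by both objects.
print(sorted(inductive_plausibility(kb, A & B, U, T)))
```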

Let κ be an AKB. The following statements are equivalent:

-   -   1. κ is i-consistent.
-   -   2. G=U, subject to C_(κ), whenever $G \overset{i}{\rightarrow}_{\kappa} T$.
-   -   3. For every ωϵ{circumflex over (κ)}, l(ω)=U whenever τ(ω)=T.
-   -   4. ϕ_(κ)(T)=U.
-   -   5. ϕ _(κ)(F)=Ø.
-   -   6. For every Lϵ
        , ϕ_(κ)(L)∪ϕ_(κ)(L′)=U.
-   -   7. For every Lϵ
        , ϕ _(κ)(L)∩ϕ _(κ)(L′)=Ø.
-   -   8. For every Lϵ
        , ϕ _(κ)(L)⊆ϕ_(κ)(L).
-   -   9. The complement κ̄ of κ is consistent.

Any of the criteria above can be used to test and ensure that the AKB is i-consistent. It follows from the above results that if κ is i-consistent, then its complement κ̄ is consistent. In this case, the new knowledge extracted from κ will be consistent with each other.

Since κ is i-consistent if and only if its complement κ̄ is consistent, in the same manner, we shall say that κ is i-xxx if and only if κ̄ is xxx. Moreover, we shall say that κ is dual-xxx if and only if both κ and κ̄ are xxx. Therefore, all results given in Section 5 can be transformed into corresponding results involving i-complete and/or i-perfect using the above results.

Let κ be an AKB. κ is monotonic if and only if for every ω₁, ω₂ϵ{circumflex over (κ)}, if τ(ω₁)⇒τ(ω₂), then l (ω₁)⊆l(ω₂).

Let κ be an AKB. The following statements are equivalent:

-   -   1. κ is monotonic.
-   -   2. The complement κ̄ of κ is monotonic.
-   -   3. For every ω₁, ω₂ϵ{circumflex over (κ)}, if τ(ω₁)⇔τ(ω₂), then l(ω₁)=l(ω₂).
-   -   4. For every ωϵ{circumflex over (κ)}, l(ω)=σ_(κ)(τ(ω)).
-   -   5. For every ωϵ{circumflex over (κ)}, l(ω)=ϕ_(κ)(τ(ω)).
-   -   6. For every ωϵ{circumflex over (κ)}, σ_(κ)(τ(ω))=ϕ_(κ)(τ(ω)).
-   -   7. For every Lϵ
        , σ_(κ)(L)⊆ϕ_(κ)(L).
-   -   8. For every Lϵ
        , σ_(κ̄)(L)⊆σ _(κ)(L).
-   -   9. {umlaut over (κ)} is consistent, where {umlaut over (κ)}=κ∪κ̄ and C_({umlaut over (κ)})=C_(κ).

Any of the criteria above can be used to test and ensure that the AKB is monotonic.

If {umlaut over (κ)} is consistent, then the original knowledge in κ and the newly extracted knowledge associated with the complement κ̄ are all consistent with each other.

Section 7. Higher Order AKB

FIG. 9 is a flow diagram of constructing consistent higher order AKBs, according to an embodiment of the present invention. FIG. 9 is a list of 3 higher order AKBs and how they are constructed. All 3 types are described in this section.

In this section, unless otherwise stated, we shall assume that n is a positive integer. We shall now present higher order AKBs using induction.

-   -   1. A 1-st order AKB is an AKB (as defined above).    -   2. Let        ₀ be an n-th order AKB. An (n+1)-th order AKB        is a finite collection of objects of the form G→μ, where G⊆U and        μϵ        ₀. G will be referred to as the reliability of μ, and        an immediate extension of        ₀.    -   3. A general AKB is a n-th order AKB.

Let

be a general AKB. Since the objects in

have the same form as objects in an AKB, many of the basic concepts andnotations for AKB can be carried over to

, e.g., ϵ

,

, τ( ), l( )/0, etc. Moreover,

is the collection of all relations among the sets or bodies of evidencesinvolved in

and/or in

.

Let

be an n-th order AKB and n>1.

_(↓)={l(ω)∩l(τ(ω))→τ(τ(ω))|ωϵ

}. Observe that

_(↓) is an (n-1)-th order AKB. This provides a way to reduce an n-th order AKB into an (n-1)-th order AKB. Moreover,

is consistent if and only if

_(↓) is consistent.

In the rest of this section, we shall present several methods for constructing consistent second order AKBs. It includes the constructions of

_(κ) ⁰,

_(κ) ¹, and different varieties of

_(κ) ².

Let κ be an AKB.

_(κ) ⁰={σ _(κ)(τ(μ))→μ|μϵκ}.

If κ is a deterministic AKB, then

contains only the elements of κ which are consistent with all the other elements of κ. Thus, for example, if κ contains both L is TRUE and L′ is TRUE, then neither element will be included in

_(κ) ⁰.

Let κ be an AKB and z,97 is a second order AKB.

-   -   is an ideal immediate extension of κ if and only if
        ={G_(μ)→μ|μϵκ} and for every τϵM_(κ), α_(κ)(τ)⊆∪_(μϵτ)G_(μ)′
        where α_(κ)(τ)=∩_(μϵτ)l(μ) and β_(κ)(τ)=∧_(μϵτ)τ(μ);
-   -   τ is F-minimal if and only if (i) β_(κ)(τ)=F and (ii) β_(κ)(τ₀)≠F whenever τ₀ is a proper subset of τ; and
-   -   M_(κ) is the collection of all F-minimal subsets of κ.

The above definitions and notations are needed to properly define
_(κ) ¹ and
_(κ) ².

Let κ be an AKB.

_(κ) ¹={(γ(μ))′→μ|μϵκ}, where for μϵκ, Γ_(κ)(μ)={τϵM_(κ)|μϵτ} and γ(μ)=∪_(τϵΓ_(κ)(μ))α_(κ)(τ) if Γ_(κ)(μ)≠Ø. Otherwise, γ(μ)=Ø.
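As a non-limiting sketch, the F-minimal subsets, the sets γ(μ), and the resulting second order objects (γ(μ))′→μ can be computed by brute force, and the second order AKB can then be reduced to a first order AKB by intersecting each reliability with the evidence of the object it qualifies. The sketch assumes a finite U and rules represented by their sets of satisfying truth assignments; the function names are hypothetical.

```python
from itertools import combinations


def f_minimal_subsets(kb, T):
    """Index tuples tau with beta(tau) = F and beta(tau0) != F for every
    proper subset tau0 of tau."""
    def beta(indices):
        rule = T
        for i in indices:
            rule &= kb[i][1]
        return rule

    minimal = []
    for size in range(1, len(kb) + 1):
        for tau in combinations(range(len(kb)), size):
            if beta(tau):
                continue   # the conjoined rules are still satisfiable
            # beta can only shrink as objects are added, so it suffices to
            # check the subsets obtained by dropping a single object.
            if all(beta(tau[:k] + tau[k + 1:]) for k in range(size)):
                minimal.append(tau)
    return minimal


def reliabilities(kb, U, T):
    """Second order objects (gamma(mu))' -> mu, as (reliability, object)."""
    gamma = [frozenset() for _ in kb]
    for tau in f_minimal_subsets(kb, T):
        alpha = U
        for i in tau:
            alpha &= kb[i][0]       # alpha(tau): shared evidence
        for i in tau:
            gamma[i] |= alpha
    return [(U - g, mu) for g, mu in zip(gamma, kb)]


def reduce_order(second_order):
    """Reduce to a first order AKB: (reliability ∩ evidence) -> rule."""
    return [(G & E, A) for G, (E, A) in second_order]


U = frozenset(range(5))
T = frozenset({True, False})        # truth values of a single atom A
A = frozenset({True})
NOT_A = T - A

kb = [(frozenset({0, 1, 2}), A), (frozenset({2, 3}), NOT_A), (frozenset({4}), A)]

for evidence, rule in reduce_order(reliabilities(kb, U, T)):
    print(sorted(evidence), sorted(rule))
```

In this example the evidence point shared by the conflicting objects is removed from both of them, and the remaining object is left unchanged.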

Given an AKB κ. Construct

_(κ) ² of κ.

-   -   1. Let
        =Ø.
-   -   2. For each τϵM_(κ) and each μϵτ, let G(μ,τ) be a new symbol in ∈
        .
-   -   3. For each τϵM_(κ), add to C_(κ) the constraint that the collection {G(μ,τ)|μϵτ} forms a partition of α_(κ)(τ).
-   -   4. For each μϵκ, add to
        the object ∩_(τϵΓ_(κ)(μ))(G(μ,τ))′→μ.
-   -   5. Return
        .

Let κ be an AKB and

an ideal immediate extension of κ. If m is a κ-measure, then m can be extended to a

_(↓)-measure M. In particular, if

=

_(κ) ², then we can assign M(G(μ,τ)) to be any nonnegative values, for every τϵM_(κ) and μϵτ, provided Σ_(μϵτ)M(G(μ,τ))=1 for every μϵκ. More information and/or human intervention will be needed in order to assign specific values for M(G(μ,τ)). However, if no such information is forthcoming, assign M(G(μ,τ))=1/|τ|.

The above discussions show how one could construct second order AKBs to guarantee consistency in κ, its complement κ̄ and/or {umlaut over (κ)}. It follows from the above results that new knowledge can be extracted from κ in meaningful ways. Thus, AKBs provide a computational framework for knowledge extraction and data mining. Moreover, the selection of different ideal immediate extensions for an AKB gives rise to different knowledge extraction/data mining schemes (see Section 8).

Section 8. Free-Form Database

FIG. 10 is a diagram for illustrating the relations for generating a free-form database, according to an embodiment of the present invention. In this section, we provide methods for constructing free-form databases (FFDB), which are a special type of AKB, and show how relational databases can be transformed into FFDBs. Among other things, we show that by so doing, one can automatically deal with uncertainties and/or incompleteness in relational databases in a natural manner. Moreover, we shall use FFDBs to construct deductive and inductive databases.

Let κ be an AKB (over

and U). κ is a free-form database (FFDB) (500) if and only if everyatomic proposition in

is of the form v=v, where v is a variable and vϵV_(v), the collection ofall possible values for v. In addition, every variable v in an FFDBsatisfies the following two conditions:

-   -   1. ((v=v ₁)∧(v=v ₂))≡F whenever v ₁, v ₂ϵV_(v) and v ₁≠v ₂; and
    -   2. ∨_(vϵV_(v)) (v=v)≡T.

Observe that a relational database

is made up of a finite set of relations. Consider a relation

in

with header A₁, . . . , A_(k). Then any member of

can be represented in the form: (A ₁ =α ₁)∧ . . . ∧(A _(k)=α_(k))⇒(

=T),  (1) where (

=T) can be interpreted as the relation is in the database

, and ⇒ is material implication in

.

Let

be a relational database.

is the smallest FFDB which contains the collection of all U→p, over all relations

in

, where p is an expression of the form given in (1) above.

Only deterministic relational databases are defined above. However, in the new representation, uncertainties and/or incompleteness could be readily incorporated in κ

by changing the U in

to any appropriate subset E of U. Moreover, by using the above representation, not only is it possible to specify which relation is in the database, but it is also possible to specify which relation is NOT in the database by setting (

=F) instead of (

=T).
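The encoding described above can be illustrated with a minimal sketch. The names `FFDBObject`, `relation_to_ffdb` and `with_evidence`, and the sample relation `Employee`, are hypothetical and are introduced here only for illustration; the universal set U is assumed to be a small finite set of integers.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class FFDBObject:
    """One object E -> p, where p has the form of expression (1)."""
    evidence: frozenset   # E, the supporting subset of the universal set U
    attributes: tuple     # ((attribute, value), ...), the left side of (1)
    relation: str         # name of the relation (hypothetical labels)
    present: bool         # True encodes (relation = T), False encodes (relation = F)

def relation_to_ffdb(name, header, rows, universe):
    """Encode every tuple of a deterministic relation as U -> p."""
    full = frozenset(universe)
    return [FFDBObject(full, tuple(zip(header, row)), name, True) for row in rows]

def with_evidence(obj, evidence, universe):
    """Replace U by a subset E of U to express uncertainty."""
    evidence = frozenset(evidence)
    if not evidence <= frozenset(universe):
        raise ValueError("evidence must be a subset of the universal set")
    return replace(obj, evidence=evidence)

universe = range(4)
objects = relation_to_ffdb(
    "Employee", ("name", "dept"), [("ann", "hr"), ("bob", "it")], universe
)
uncertain = with_evidence(objects[1], {0, 1}, universe)
```

Setting `present=False` on an object would correspond to stating that the tuple is not in the database.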

The dependencies/independencies among the data are of paramount importance in traditional databases, since these databases are subject to constant updating. Violations of these dependencies/independencies are the main sources of anomalies in these databases. The dependencies can be characterized by one-to-one and/or many-to-one relationships (dependencies). A many-to-one dependency (or functional dependency) can be expressed as: ∧_(i=1) ^(k)(v_(i) ^(L)=v _(i) ^(L))⇒(∧_(j=1) ^(n)(v_(j) ^(R)=v _(j) ^(R)))∧(

=T),  (2) where v _(i) ^(L)ϵV_(v) _(i) ^(L) and v _(j) ^(R)ϵV_(v) _(j) ^(R) for all i=1, 2, . . . , k and j=1, 2, . . . , n. A one-to-one dependency can be expressed as two many-to-one dependencies.

Expression (2) reduces to expression (1) when n=0. The left hand side of expression (2), for n≥0, represents independencies. In other words, the values of the variables v₁ ^(L), v₂ ^(L), . . . , v_(k) ^(L) in (2) are arbitrary and independent of each other, and the set {v₁ ^(L), v₂ ^(L), . . . , v_(k) ^(L)} of variables forms a key. The concepts of multi-valued dependency and join dependency (R. Fagin. A normal form for relational databases that is based on domains and keys. ACM Transactions on Database Systems, 6:387-415, 1981) can be viewed as some sort of independencies/dependencies, and thus can be reformulated as such.

Let

be a relational database.

is the smallest FFDB which contains the collection of all U→p, where p is an expression of the form given in (2) above, with n≥0, over all dependencies/independencies in

. In addition,

=

∪

. In this case, we shall also say that κ is an augmented relational database (510).

The expression given by (2) above can be decomposed into the n expressions: ∧_(i=1) ^(k)(v_(i) ^(L)=v _(i) ^(L))⇒(v_(j) ^(R)=v _(j) ^(R)), j=1, 2, . . . , n, and the expression (∧_(i=1) ^(k)(v_(i) ^(L)=v _(i) ^(L)))⇒(

=T).

Each of these expressions, after decomposition, is a Horn clause. Since expressions given by (1) are also Horn clauses, unification for any augmented relational database can be carried out in polynomial time.
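The decomposition can be sketched as follows. This is an illustrative sketch only: a clause is represented as a hypothetical pair (body, head), where the body is a tuple of (variable, value) atoms and the head is a single atom, and the function name `decompose_dependency` and the sample data are assumptions.

```python
def decompose_dependency(lhs, rhs, relation):
    """Split one dependency of form (2) into n + 1 Horn clauses.

    lhs, rhs: lists of (variable, value) atoms; relation: relation name.
    Each returned clause is (body, head) with a single atom as head.
    """
    body = tuple(lhs)
    # One clause per right-hand-side atom: body => (v_j^R = value_j).
    clauses = [(body, atom) for atom in rhs]
    # Plus the clause stating that the relation holds: body => (relation = T).
    clauses.append((body, (relation, True)))
    return clauses

clauses = decompose_dependency(
    lhs=[("emp_id", 7)],
    rhs=[("name", "ann"), ("dept", "hr")],
    relation="Employee",
)
```

With an empty right-hand side (n=0), only the final clause remains, which matches the reduction of expression (2) to expression (1).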

The power of FFDBs is manifested not just in their capability and flexibility in representing relational databases (Amihai Motro. Imprecision and incompleteness in relational databases: Survey. Information and Software Technology, 32(9):279-588, 1990; S. K. Lee. An extended relational database model for uncertain and imprecise information. Proceedings of VLDB, pages 211-220, 1992) with uncertainties and/or incompleteness, but also, as is shown below, in their ability to represent any deductive database (Raymond T. Ng. Reasoning with uncertainty in deductive databases and logic programs. International Journal of Uncertainty, Fuzziness and Knowledge-based Systems, 5(3):261-316, 1997) under uncertainty, i.e., an augmented deductive database. Moreover, the concept of FFDB lends itself to the construction of augmented inductive databases. These databases can serve as a formal framework for extracting new information from relational databases.

Clearly, an augmented deductive database, i.e., a deductive database centered on the relational database

, can be represented by a FFDB κ, where

⊆κ. Using such a representation, the full force of the results obtained for AKBs can be applied to deductive databases. Therefore, unlike existing deductive databases, the augmented deductive database is capable of handling full-fledged uncertainties and/or incompleteness (520 and 525). According to an aspect of an embodiment, at 520, the deduction of 220 in FIG. 3 is applied to the FFDB to obtain augmented deductive databases. According to an aspect of an embodiment, at 530, an inductive version of 230, defined in Paragraph 137, is applied to the FFDB to obtain augmented inductive databases.

Consider an FFDB κ where

⊆κ and

is a relational database. If inductive reasoning (see Section 6) is employed instead of deductive reasoning, i.e., using ϕ_(κ) instead of σ_(κ), then we have an augmented inductive database (530 and 535). Unlike deductive databases, we are not interested in searching, deducing, and/or generating views for inductive databases. Instead, we are interested in acquiring possible new knowledge from the database, i.e., data mining.

Most FFDBs are consistent. Nevertheless, κ may not be consistent. Thus, the construction of second order AKBs associated with κ given in Section 7 is central to data mining in a relational database.

As discussed in Section 7, there are many different ways to guarantee consistency by selecting different associated second order AKBs. The different selections give rise to various natural schemes that can be used in data mining.

From the above discussions, it is clear that an augmented inductive database can serve as a general framework and formal model for knowledge acquisition and data mining in relational databases, with or without uncertainties and/or incompleteness.

Section 9. Average and Stochastic Independence

The main objective of the rest of this document is to show how to extend the measure m, with domain ∈, to {tilde over (m)}, with domain {tilde over (∈)}. We are particularly interested in those cases where both m and {tilde over (m)} are probability measures. This problem is central to many applications, in particular, reasoning in a knowledge-based system involving uncertainties and/or incompleteness, such as AKBs.

In the rest of this document, we shall denote by

the set of all non-negative integers, and denote by

⁺ the set of all positive integers.

Let us interpret the probabilities as frequencies. In other words, let ∈⊆2^(U) (∈ is a collection of subsets of the universal set U), |U|=n (the universal set U has n elements), and let m denote a function from ∈ into [0, 1] such that for every E in ∈, m(E)=k/n, where k=|E|. (|E| denotes the cardinality of E, i.e., the number of elements in E.) Thus, m is a probability measure over ∈, and m(E) represents the probability of E.

If the sets in ∈ are explicitly specified, the problem of extending a measure m to {tilde over (m)} can be based upon determining the cardinality of any set in {tilde over (∈)}. Unfortunately, this might not be the case. If E is in ∈ and |E|=k, then E can stand for any subset of U with cardinality k, subject to possibly some other constraints which will be discussed in subsequent sections.

A natural probabilistic extension for a measure m with domain ∈ to {tilde over (m)} with domain {tilde over (∈)} can be obtained by taking the average of the probabilities of all allowable sets in {tilde over (∈)}.

Let ∈⊆2^(U), and A, Bϵ∈. Let |U|=n, |A|=a and |B|=b. Assume that A can represent any subset of U whose cardinality is a, and B can represent any subset of U whose cardinality is b. Then the average probability of A∩B is given by:

$\tilde{m}(A\cap B)=\frac{\sum_{i}\left[\binom{n}{a}\binom{a}{i}\binom{n-a}{b-i}\frac{i}{n}\right]}{\binom{n}{a}\binom{n}{b}}=\frac{\binom{n-1}{a-1}\binom{n-1}{b-1}}{\binom{n}{a}\binom{n}{b}}=\frac{a}{n}\times\frac{b}{n}=m(A)\,m(B)$

The probability of any Eϵ{tilde over (∈)} above can be determined in a similar manner. In particular, it can be shown that {tilde over (m)}(A∩B′)=m(A)[1-m(B)] and {tilde over (m)}(A∪B)=m(A)+m(B)-m(A)m(B).
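These identities can be checked numerically on a small universe by enumerating every allowable pair of subsets and averaging. The following is a brute-force sketch for illustration only; the function name `average_measure` and the choice n=5, a=2, b=3 are assumptions, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def average_measure(n, a, b, combine):
    """Average of |combine(A, B, U)| / n over all A, B with |A| = a, |B| = b."""
    universe = frozenset(range(n))
    total, count = Fraction(0), 0
    for A in combinations(range(n), a):
        for B in combinations(range(n), b):
            result = combine(frozenset(A), frozenset(B), universe)
            total += Fraction(len(result), n)
            count += 1
    return total / count

m_A, m_B = Fraction(2, 5), Fraction(3, 5)
intersection = average_measure(5, 2, 3, lambda A, B, U: A & B)
difference = average_measure(5, 2, 3, lambda A, B, U: A - B)   # A ∩ B′
union = average_measure(5, 2, 3, lambda A, B, U: A | B)
```

The enumeration grows combinatorially with n, so it is only suitable as a check on very small cases.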

Under the assumption that no constraints are imposed, the average method used to compute {tilde over (m)} turns out to be the same as Scheme 2 given above. On the other hand, Scheme 2 is based on the implicit assumption that the sets in ∈ are stochastically independent. Because of this, Scheme 2 fails completely if constraints exist among the sets in ∈. If constraints exist, the sets may no longer be stochastically independent. For example, under Scheme 2, {tilde over (m)}(A∩B)=m(A)m(B) for all A and B, even in the cases where A⊆B or A∩B=Ø.

Constraints among the sets in ∈ could be inherent in most applications. The average method discussed above will still work even when such constraints exist. This will be examined in detail in the subsequent sections. Indeed, as we shall see below, using the average method, a new measure over ∈ may be created which takes into consideration one or more of (e.g., all) the constraints involved; and under the new measure, the sets in ∈ may be viewed as stochastically independent. For this reason, we shall refer to this new method as the constraint stochastic independence method.

The constraint stochastic independence method is the basis of a new computational model of uncertain reasoning, namely augmented reasoning. One of the important advantages of this method is its ability to deal with relationships among the bodies of evidences, or its capability to take into account constraints imposed on the bodies of evidences. For knowledge-based systems, this provides a unique mechanism to resolve relationships among the bodies of evidences. As stated earlier, this is a clear departure from all existing knowledge bases, which are incapable of supporting such constraints. This added feature can lead to more powerful, robust, accurate and/or refined characterizations of knowledge bases.

Section 10. Constraints

Let ∈⊆2^(U), and A, B, Cϵ∈, where the cardinalities of U, A, B and C are n, a, b and c, respectively. Suppose the only constraint is (A∪B)∩C=Ø. (This is equivalent to the two constraints A∩C=Ø and B∩C=Ø.) In other words, A, B and C can represent any subsets of U with cardinality a, b and c, respectively, satisfying (A∪B)∩C=Ø. In this case, provided m(A)+m(C)≤1 and m(B)+m(C)≤1, the average probability of A∩B can be computed as follows:

$\tilde{m}(A\cap B)=\frac{\sum_{i}\left[\binom{n}{c}\binom{n-c}{a}\binom{a}{i}\binom{n-c-a}{b-i}\frac{i}{n}\right]}{\binom{n}{c}\binom{n-c}{a}\binom{n-c}{b}}=\frac{\binom{n-1}{c}\binom{n-c-1}{a-1}\binom{n-c-1}{b-1}}{\binom{n}{c}\binom{n-c}{a}\binom{n-c}{b}}=\frac{m(A)\,m(B)}{1-m(C)}$

The above example illustrates the roles constraints play in the extension of m to {tilde over (m)}, as well as the restrictions that constraints impose on the values of the original m.
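The constrained average can also be confirmed by direct enumeration. The sketch below is illustrative only: it chooses C first, then draws A and B from the complement of C so that (A∪B)∩C=Ø holds by construction. The function name `average_with_disjoint_c` and the values n=6, a=2, b=3, c=2 are assumptions chosen to satisfy m(A)+m(C)≤1 and m(B)+m(C)≤1.

```python
from fractions import Fraction
from itertools import combinations

def average_with_disjoint_c(n, a, b, c):
    """Average of |A ∩ B| / n over all (A, B, C) with (A ∪ B) ∩ C = Ø."""
    universe = set(range(n))
    total, count = Fraction(0), 0
    for C in combinations(range(n), c):
        rest = sorted(universe - set(C))      # A and B must avoid C
        for A in combinations(rest, a):
            for B in combinations(rest, b):
                total += Fraction(len(set(A) & set(B)), n)
                count += 1
    return total / count

n, a, b, c = 6, 2, 3, 2
observed = average_with_disjoint_c(n, a, b, c)
predicted = (Fraction(a, n) * Fraction(b, n)) / (1 - Fraction(c, n))
```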

It is clear from the above discussions that we do not have to be interested in the specific subsets of U. Each member E in ∈ represents a potential subset of U where |E| is fixed and E satisfies the constraints specified. The notation ∈⊆2^(U) does not accurately reflect this concept. Therefore, in an embodiment of this invention, we shall introduce the new notation ∈

2^(U), together with certain related new concepts.

In the new notation ∈

2^(U), we can view each member of ∈ as a set variable which can be assigned certain subsets of U. For each Eϵ∈, all subsets of U that can be assigned to E should have the same cardinality. This common cardinality will be denoted by |E|.

Given ∈

2^(U), ∈ is completely defined only if |E| is specified for every Eϵ∈. Alternatively, we can specify a function m from ∈ into [0, 1] and let |E|=m(E)×|U| for all Eϵ∈. Observe that m(E) is the probability of any subset of U that can be assigned to E. We shall refer to m as a measure over ∈, and denote by |⋅|_(m) the cardinality induced by m, i.e., |E|_(m)=m(E)×|U| for all Eϵ∈. Clearly, if m(E)=0, then E can only be assigned the subset Ø; and if m(E)=1, then E can only be assigned the subset U. For simplicity, we shall also denote by Ø (U) the set variable that can be assigned only the set Ø (U).

The members of ∈, where ∈

2^(U), may be subject to certain constraints as illustrated above. Each constraint imposes further restrictions on the subsets of U that can be assigned to the set variables in ∈. It is clear from the above example that given a probability measure m and a collection of constraints

for ∈, some additional conditions may have to be imposed to ensure that m and

are compatible.

More formally, we start with a collection ∈ of objects. The collection {tilde over (∈)} is defined recursively as follows:

1. If Eϵ∈, then E is in {tilde over (∈)}.

2. If Z, Z₁, Z₂ϵ{tilde over (∈)}, then Z′, Z₁∪Z₂ and Z₁∩Z₂ are in {tilde over (∈)}.

A constraint for ∈ is an expression of the form Z₁ rel Z₂, where Z₁, Z₂ϵ{tilde over (∈)} and rel is one of the following relational operators: ⊂, ⊆, ⊄, ⊈, ⊃, ⊇, ⊅, ⊉, =, ≠.

A complete instantiation ζ of ∈ is a mapping from ∈ into 2^(U). If ζ is a complete instantiation of ∈, and W any expression or relation involving members of ∈, then ζ(W) is the expression or relation obtained by replacing each Eϵ∈ that occurs in W by ζ(E). In particular, if W is a constraint for ∈, then ζ(W) is the set relation obtained by replacing each Eϵ∈ that occurs in W by ζ(

). If ζ is a complete instantiation of ∈, and

is a collection of constraints for ∈, then ζ(

) is the collection of all ζ(W) where Wϵ

. Let

be a collection of constraints for ∈, and W a constraint for ∈. W is derivable from

if and only if for all complete instantiations ζ of ∈, ζ(W) follows from ζ(

). Let

₁ and

₂ be collections of constraints for ∈.

₁ and

₂ are equivalent if and only if for every constraint W for ∈, W is derivable from

₁ whenever W is derivable from

₂, and vice versa.

A complete definition of ∈

2^(U) requires the specification of a collection

of constraints for ∈ (

may be empty, i.e., no constraint is imposed on ∈), and a measure m over ∈. A measure m over ∈ is a function from ∈ into [0, 1] such that for every Eϵ∈, |E|_(m)=|U|×m(E)ϵ

.

A complete instantiation ζ of ∈ is legitimate over m and

if and only if for each Eϵ∈, |ζ(E)|=|E|_(m), and for every Wϵ

, ζ(W) holds.

Let m be a measure over ∈ and ζ a legitimate complete instantiation of ∈ over m and

. Let ζ_(m) be a function from ζ(∈)={ζ(E)|Eϵ∈} into [0, 1], where ζ_(m)(ζ(E))=m(E). Clearly, ζ_(m) is a probability measure over ζ(∈).

If there is at least one complete instantiation legitimate over m and

, then m and

are compatible (or m is compatible with

).

is consistent if and only if m and

are compatible for some measure m over ∈. The issue of compatibility is addressed in Section 13 below.

Let

be a collection of constraints for ∈

2^(U), and m a measure over ∈. Let W be a relation involving m and members of ∈. W is valid if and only if for all complete instantiations ζ of ∈ legitimate over m and

, ζ(W) holds.

In what follows, if

is a collection of constraints for ∈

2^(U), then

is assumed to be consistent. Moreover, given ∈

2^(U), if the concepts or results hold for arbitrary

, then

will not be specified. In other words, no

doesn't mean that

is empty. Similarly, if the concepts or results hold for arbitrary m, then m will not be specified.

Let ∈

2^(U),

a collection of constraints for ∈, and m a measure over ∈ which is compatible with

. {tilde over (m)} is the function from {tilde over (∈)} into [0, 1], such that for every Eϵ{tilde over (∈)}, {tilde over (m)}(E) is the average of all |ζ(E)|/|U|, where ζ is taken over all complete instantiations of ∈ which are legitimate over m and

.

Clearly, in the above definition, the extension of m to {tilde over (m)} corresponds to using the average for finding the probability of the new set. The compatibility of m and

guarantees that {tilde over (m)} is a ‘probability measure’.
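The definition of the extension can be sketched as a brute-force procedure: enumerate every complete instantiation with the prescribed cardinalities, keep the legitimate ones, and average. This is an illustrative sketch, not an efficient method. The name `extend_measure`, the representation of constraints as Python predicates over an instantiation, and the convention of returning `None` when no legitimate instantiation exists (i.e., the measure and the constraints are not compatible) are all assumptions.

```python
from fractions import Fraction
from itertools import combinations, product

def extend_measure(n, sizes, constraints, expr):
    """Average of |expr(zeta)| / n over all legitimate instantiations zeta.

    sizes: dict mapping each set variable to its cardinality |E|_m.
    constraints: predicates taking the instantiation dict zeta.
    expr: function (zeta, U) -> subset of U whose probability is wanted.
    """
    universe = frozenset(range(n))
    names = sorted(sizes)
    choices = [
        [frozenset(s) for s in combinations(range(n), sizes[name])]
        for name in names
    ]
    total, count = Fraction(0), 0
    for combo in product(*choices):
        zeta = dict(zip(names, combo))
        if all(holds(zeta) for holds in constraints):   # legitimacy check
            total += Fraction(len(expr(zeta, universe)), n)
            count += 1
    return total / count if count else None

subset = [lambda z: z["A"] <= z["B"]]
disjoint = [lambda z: not (z["A"] & z["B"])]
inter = lambda z, U: z["A"] & z["B"]

free_case = extend_measure(4, {"A": 1, "B": 2}, [], inter)
subset_case = extend_measure(4, {"A": 1, "B": 2}, subset, inter)
impossible = extend_measure(4, {"A": 3, "B": 3}, disjoint, inter)
```

In this small example the unconstrained value equals m(A)m(B), the subset constraint raises it to m(A), and two disjoint sets of three elements cannot fit in a universe of four.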

In both the examples given above, {tilde over (m)}(A∩B) is well-defined or unique, since its value depends solely on the probabilities of the sets which are involved. However, this is not true in general.

It can be shown that inclusion of the relational operators of the types ≠, ⊄, ⊈, ⊅ or ⊉ in a collection of constraints for ∈ may cause the extension {tilde over (m)} of m to be ill-defined, i.e., not well-defined.

In addition, for the extension {tilde over (m)} to be well-defined, all constraints of the form B₁∩B₂∩ . . . ∩B_(l)=Ø, where l>2, should be derivable from other constraints. In other words, one need only consider disjoint relations or constraints of the form A∩B=Ø. Unfortunately, as we shall see in the next section, this restriction, together with the restrictions given below, is not sufficient to guarantee that the extension will be well-defined (i.e., that the extension will depend solely on the individual probabilities).

There are many possible constraints or relations among the elements in ∈

2^(U), which do not contain relational operators of the types ≠, ⊄, ⊈, ⊅ and ⊉. These constraints are listed below:

1. (disjoint) In view of the remark given above, an example intersection involving only two members of ∈ is described. Let A, Bϵ∈. Since A′∩B′=Ø is equivalent to A∪B=U, all such relationships will be replaced by equivalent relationships discussed below (Case 6) and dealt with accordingly. In addition, A∩B′=Ø is equivalent to A⊆B. These relations will be converted to subset relations (Case 2) and dealt with accordingly. Thus, we shall restrict our attention only to disjoint relations of the form A∩B=Ø where both A and B are in ∈. In addition, if A⊆B and B∩C=Ø, then A∩C=Ø. Thus, it is essential to capture these implied disjoint relations.

2. (subset) Let A, Bϵ∈. Clearly,

A⊆B′ is equivalent to A∩B=Ø (Case 1 above);

A′⊆B is equivalent to A∪B=U (Case 6 below); and

A′⊆B′ is equivalent to B⊆A.

Therefore, it is only necessary to consider A⊆B where A, Bϵ∈. However, ⊆ is a transitive relation. Thus, it is essential that we capture most of, or all of, or a number of (according to application criteria) the subset relations, whether they are explicit, implicit or derived. Moreover, subset relations could give rise to equality relations. These equality relations (Case 3) should be captured and dealt with accordingly.

3. (equality) Let A, Bϵ∈. If A=B, then we can remove B from ∈ and replace all occurrences of B by A, and all occurrences of B′ by A′. Alternatively, A=B can be replaced by |A|=|B|, and either A⊆B or B⊆A. (If either A⊆B or B⊆A is implied by other constraints, then replace A=B with |A|=|B|.) A=B′ is equivalent to {A, B} being a partition of U, which is Case 4 below. Furthermore, A=Ø can be replaced by |A|=0, and A=U can be replaced by |A|=|U|.

4. (partition) {A₁, A₂, . . . , A_(k)} is a partition of A₀ if and only if the following conditions hold:

-   -   a. A_(i)∩A_(j)=Ø for all 1≤i, j≤k and i≠j;
    -   b. A_(i)⊆A₀ for 1≤i≤k; and
    -   c. |A₀|=|A₁|+|A₂|+ . . . +|A_(k)|.

Hence, all partition relations can be replaced by disjoint relations and subset relations, plus imposing certain conditions on the measure m.

5. (A∩B=C) If A∩B=C≠Ø, then extend ∈ to include the objects G₁ and G₂, where |G₁|=|A|-|C| and |G₂|=|B|-|C|. In essence, we are trying to decompose A and B into the sets A=G₁∪C and B=G₂∪C. Clearly, A∩B=C is completely characterized by the values assigned to |G₁| and |G₂|, and the facts that {C, G₁} is a partition of A, and {C, G₂} is a partition of B.

6. (A∪B=C) If A∪B=C, then extend ∈ to include the objects G₀, G₁ and G₂, where |G₀|=|A|+|B|-|C|, |G₁|=|A|-|G₀|, and |G₂|=|B|-|G₀|. In essence, we are trying to decompose A and B into the sets G₀=A∩B, G₁=A-G₀ and G₂=B-G₀. Clearly, A∪B=C is completely characterized by the values assigned to |G₀|, |G₁| and |G₂|; the fact that the three sets G₀, G₁ and G₂ are pairwise disjoint; and the facts that {G₀, G₁} is a partition of A, and {G₀, G₂} is a partition of B. In view of Cases 1 and 2 above, the case C=U is of particular significance.

7. (intersection) A₀=A₁∩A₂∩ . . . ∩A_(k) is equivalent to A₁∩A₂∩ . . . ∩A_(k)∩A₀′=Ø. In view of the above result, only the case k=2 will be considered. This is equivalent to Case 5 above.

8. (union) A₀=A₁∪A₂∪ . . . ∪A_(k) is equivalent to A₀′=A₁′∩A₂′∩ . . . ∩A_(k)′. Therefore, only the case k=2 will be considered. This is equivalent to Case 6 above.

9. (proper subset) Let A, Bϵ∈. A⊂B if and only if A⊆B and |A|<|B|.

Observe that:

-   -   constraints of the form A⊆B∪C and of the form A∩B⊆C, if not derivable from other constraints, may cause {tilde over (m)} to be ill-defined;
    -   constraints of the form B∪C⊆A can be characterized using only subset relations; and
    -   constraints of the form C⊆A∩B can be characterized using only subset relations.

The above discussions show how various constraints or relations can be transformed into constraints or relations of the form A∩B=Ø and/or A⊆B, where A, Bϵ∈.

Section 11. Graph Representations

FIG. 11 is a flow diagram for generating graph representations, according to an embodiment of the present invention. FIG. 11 presents two graph representations, which are used in the subsequent discussions of admissibility and compatibility. Both representations are described in this section. In the previous section, we showed that constraints or relations that can be transformed into constraints or relations of the form A∩B=Ø and/or A⊆B, where A, Bϵ∈, are, for example, necessary for the extension to be well-defined. In this section, we shall show that these conditions might not be sufficient. In order to get a better understanding of the structures of collections of constraints, we shall also represent collections of constraints using graphs.

Let ∈

2^(U). An acceptable constraint or relation for ∈ is a constraint or relation of the form A∩B=Ø or of the form A⊆B, where A, Bϵ∈. Moreover, let

be a collection of constraints for ∈

2^(U).

is weakly acceptable if and only if every constraint in

is acceptable.

Clearly, ⊆ is transitive. Moreover, if A⊆B and B∩C=Ø, then A∩C=Ø. Therefore, there might be constraints or relations not in

but implied by the constraints or relations in

.

Let

be a collection of constraints for ∈

2^(U).

is maximal if and only if all acceptable constraints implied by

are also in

. Moreover,

_(X) will denote the smallest weakly acceptable collection of constraints that contains

and is maximal.

Algorithm 1. (Construction of 𝒞_(X)) Let 𝒞 be a weakly acceptable collection of constraints for ε

2^(U).

1. Let 𝒞_(X) = 𝒞.
2. Let 𝒞₀ be the subset of 𝒞 which contains all the subset relations of 𝒞, i.e., (A ⊆ B) ∈ 𝒞₀ if and only if (A ⊆ B) ∈ 𝒞.
3. Construct the transitive closure of 𝒞₀ using any standard method, for example, Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison Wesley, 1974.
4. Enlarge 𝒞_(X) to include all members in the transitive closure of 𝒞₀.
5. For each (A ⊆ B) ∈ 𝒞_(X), let 𝒞_(A⊆B) be the set of all relations of the form (B ∩ C = Ø) ∈ 𝒞_(X). For each relation (B ∩ C = Ø) ∈ 𝒞_(A⊆B), add to 𝒞_(X) the relation (A ∩ C = Ø).
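The construction above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: subset relations are given as pairs (A, B) meaning A⊆B, disjoint relations as pairs of distinct names, and the function name `maximal_closure` is hypothetical. The transitive closure uses a naive fixed-point loop rather than any particular textbook method, and the disjoint-propagation step is repeated until nothing new is derived, which is a deliberate choice of this sketch so that relations derived from newly added disjoint pairs are not missed.

```python
def maximal_closure(subset_pairs, disjoint_pairs):
    """Return (subsets, disjoint) closed under the two implication rules.

    Rule 1: A ⊆ B and B ⊆ C imply A ⊆ C.
    Rule 2: A ⊆ B and B ∩ C = Ø imply A ∩ C = Ø.
    """
    subsets = set(subset_pairs)
    changed = True
    while changed:                       # transitive closure of the subset relations
        changed = False
        for a, b in list(subsets):
            for c, d in list(subsets):
                if b == c and a != d and (a, d) not in subsets:
                    subsets.add((a, d))
                    changed = True
    disjoint = {frozenset(p) for p in disjoint_pairs}
    changed = True
    while changed:                       # propagate disjointness down to subsets
        changed = False
        for a, b in subsets:
            for pair in list(disjoint):
                if b in pair:
                    (c,) = pair - {b}    # the other member of the disjoint pair
                    derived = frozenset((a, c))
                    if a != c and derived not in disjoint:
                        disjoint.add(derived)
                        changed = True
    return subsets, disjoint

subsets, disjoint = maximal_closure({("A", "B"), ("B", "C")}, {("C", "D")})
```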

Let

be a weakly acceptable collection of constraints for ∈

2^(U).

-   -   1. The directed graph associated with (∈,        ), denoted by        _(∈)(        ), is the directed graph (∈,        ), where        ={(A, B)|A, Bϵ∈, either (A∩B=Ø) or (A⊆B) ϵ        }.
    -   2. The SR-graph associated with (∈,        ) is the directed graph        =(∈,        ) where        ={(A, B)|A, Bϵ∈, (A⊆B)ϵ        }.

Let

be a weakly acceptable collection of constraints for ∈

2^(U), and let

be the SR-graph associated with (∈,

). Clearly, every directed cycle in

is equivalent to some B₁⊆B₂⊆ . . . ⊆B_(k), where B₁=B_(k), and vice-versa. This implies that B₁=B₂= . . . =B_(k). Due to the above observation, the following algorithm can be used to remove one or more of (e.g., all) equality relations and one or more of (e.g., all) subset relations in C that give rise to equality:

Algorithm 2. (Removal of Equality Relations) Let 𝒞 be a weakly acceptable collection of constraints for ε

2^(U), and 𝒢 be the SR-graph associated with (ε, 𝒞).

1. Remove both A ⊆ B and B ⊆ A if both are in 𝒞.
2. Determine all directed cycles in 𝒢 (this can be done using standard algorithms, for example, the depth-first search algorithm discussed in Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison Wesley, 1974).
3. Remove all the subset relations from 𝒞 which appear in the directed cycles.
4. Process the equality relations obtained in Steps 1 and/or 2 using the methods given above on how to handle equality.
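One way to realize these steps is sketched below. It is not the depth-first search of the cited text: for brevity it computes reachability with a naive fixed-point loop and groups mutually reachable nodes, which identifies the same sets that lie on a common directed cycle. Equalities are then handled by the "replace all occurrences of B by A" option described under Case 3. The function names and the choice of the smallest name as representative are assumptions of this sketch.

```python
def equality_classes(nodes, subset_edges):
    """Groups of set variables forced equal by directed cycles of ⊆."""
    reach = {v: {v} for v in nodes}
    changed = True
    while changed:                         # naive reachability fixed point
        changed = False
        for a, b in subset_edges:
            new = reach[b] - reach[a]
            if new:
                reach[a] |= new
                changed = True
    classes, seen = [], set()
    for v in nodes:
        if v in seen:
            continue
        cls = {w for w in reach[v] if v in reach[w]}   # mutually reachable
        seen |= cls
        if len(cls) > 1:
            classes.append(cls)
    return classes

def remove_equalities(nodes, subset_edges):
    """Collapse each equality class to one representative and drop A ⊆ A."""
    classes = equality_classes(nodes, subset_edges)
    rep = {v: min(cls) for cls in classes for v in cls}
    kept = {(rep.get(a, a), rep.get(b, b)) for a, b in subset_edges}
    return classes, {(a, b) for a, b in kept if a != b}

nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
classes, kept = remove_equalities(nodes, edges)
```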

Given a weakly acceptable collection of constraints

for ∈

2^(U).

is acceptable if and only if

is maximal, ∈ϵ∈, and no equality relation between any two members of ∈ can be derived from the constraints in

.

Let

be an undirected graph and let C be a cycle in

. C is pure if and only if C is simple, has length>3, and contains no subcycles. If

is a directed graph and C an undirected cycle in

, then C is a pure cycle of

if and only if C is a pure cycle of |

|.

Let

be an acceptable collection of constraints for ∈

2^(U). If there exists a pure cycle in |

_(∈)(

)|, then the extension of m is not well-defined (it is not dependent solely on the individual probabilities).
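A pure cycle can be searched for directly on small graphs. The sketch below assumes that "contains no subcycles" means the cycle has no chord, so that a pure cycle is exactly a set of more than three nodes whose induced subgraph is a single cycle. It tests every node subset, which is exponential and intended only as an illustration; the function names are hypothetical.

```python
from itertools import combinations

def _connected(subset, adj):
    """True if the subgraph induced by subset is connected."""
    start = next(iter(subset))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] & subset:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == subset

def has_pure_cycle(nodes, edges):
    """True if some induced subgraph on more than 3 nodes is a single cycle."""
    nodes = list(nodes)
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    for size in range(4, len(nodes) + 1):
        for chosen in combinations(nodes, size):
            subset = set(chosen)
            # Every node has exactly two neighbours inside the subset.
            if all(len(adj[v] & subset) == 2 for v in subset) and _connected(subset, adj):
                return True
    return False

square = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
triangle = [("A", "B"), ("B", "C"), ("C", "A")]
pentagon_with_chord = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"), ("A", "C")]
```

Adding the chord A–C to the square removes its pure cycle, whereas the pentagon with the same chord still contains the chordless cycle A–C–D–E.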

Section 12. Admissibility

FIG. 12 is a flow diagram for checking for admissibility, according to an embodiment of the present invention. We have shown in the previous section that for m to have a well-defined extension, not only is it necessary that the collection

of constraints be acceptable, but also that the undirected graph that represents

cannot have any pure cycle. The main result of this section asserts that this is not only a necessary condition, but it is also a sufficient condition.

Let

be a directed or undirected graph.

is admissible if and only if

has no pure cycles. Moreover, let

be an acceptable collection of constraints for ∈

2^(U).

is admissible if and only if

_(∈)(

) is admissible.

Clearly, if

is empty, then

is admissible.

is empty means the elements in ∈ are not subject to any explicit constraints.

Let

=(V,

) be an undirected graph and

a total ordering on V.

is permissible on

if and only if for every A, B, C ∈V, if {A, B}, {A, C} ∈

and {B, C} ∉

, then either A

B or A

C. Let

=(V,

) be a tree with root T.

-   -   Let AϵV.        (A), the depth of A over        , is the length of the path from T to A. In particular, d(T)=0.    -   Let AϵV.        (A) is the set of all children of A in        .    -   is the total ordering on V, where for all A, B ϵV, A        B if and only if either d(A)<d(B), or d(A)=d(B) and A lies to        the left of B.    -   Let AϵV.        (A) is the set of all BϵE V where B=A; d(A)=d(B) and A        B; or B is the child of some E in        where d(E)=d(A) and E        A.

Let

=(V, ,

) be a forest, i.e., a collection of disjoint trees, and

a total ordering on V.

is compatible with

if and only if for every A, B ϵV, A

B whenever A

B for some tree

contained in

.

Let

=(V,

) and

=(V₀,

) be directed or undirected graphs.

is a maximal directed or undirected subgraph of

if and only if V₀⊆V, and for every A, B ϵV₀, {A, B} ϵ

₀ if and only if {A, B}ϵ

.

Let

=(V,

) be an undirected graph and

a total ordering on V. The following algorithm provides a constructive definition for the concept of recursive breadth-first search (RBFS) forest for

based on

. An RBFS forest can be constructed recursively using recursive breadth-first search. In the following algorithm, the nodes in V are initially unmarked, and marking of nodes is a global operation.

Algorithm 1. (Construction of RBFS forest) Given an undirected graph 

 = (V, 

 ) where V ≠ Ø, and 

 a total ordering on V. 1. Let Q be a queue which is initially empty. 2.Let 

 be a forest which is initially empty. 3. While there are unmarked nodesin V, do the following:  a)  Let T be the unmarked node in V thatprecedes all other unmarked nodes in V under  

 .  b)  Mark T, and enqueue T into Q.  c)  Add to 

 the tree 

 which includes the node T (The root of 

 is T).  d)  While Q is not empty, do the following:  i. Dequeue from Q and let E be the node dequeued.  ii. Let S be the set consisting of all unmarked nodes in V which are neighbors of E in

 . iii. If S is not empty, do the following: A. Let 

 ₀ = (S, 

 ₀) be a maximal undirected subgraph of 

 . B. Call Algorithm 1 with input 

 ₀ and let 

 ₀ be the output from the algorithm. C. Let 

 _(S) be a total ordering on S compatible with 

 ₀ and order S according to

 _(S). D. Make the nodes in S children of E and arrange the children in the prescribed order, i.e., according to 

 _(S). E. Append S to Q in the prescribed order, i.e., according to 

 _(S). 4. Output 

 .

Let 𝒢=(V,

) be an undirected graph and let

be a total ordering on V. Let

₀ be a total ordering on V.

is an RBFS-ordering on

based on

₀ if and only if

is compatible with the RBFS forest for

based on

₀. Moreover,

is an RBFS-ordering on

if and only if there exists a total ordering

₀ on V such that

is an RBFS-ordering on

based on

₀.

Let

be an acceptable collection of constraints for ∈

2^(U). The following statements are equivalent:

-   -   1.        is admissible.
    -   2. |        _(∈)(        )| is admissible.
    -   3. |        _(∈)(        )| contains no pure cycle.
    -   4. There exists a total ordering on ∈ which is permissible on |        _(∈)(        )|.
    -   5. Any RBFS-ordering on |        _(∈)(        )| is permissible on |        _(∈)(        )|.

Any of the criteria above can be used to test and ensure that

is admissible.
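The permissible-ordering criterion can be sketched as follows. The sketch assumes the condition reads: whenever {A, B} and {A, C} are edges and {B, C} is not an edge, A precedes B or A precedes C. Equivalently, the neighbors of A that precede A must be pairwise adjacent. The exhaustive search in `find_permissible_ordering` is exponential and only illustrates the existence criterion; it is not the polynomial RBFS-based procedure, and both function names are hypothetical.

```python
from itertools import combinations, permutations

def is_permissible(order, edges):
    """Check the permissibility condition for a total ordering (a sequence)."""
    position = {v: i for i, v in enumerate(order)}
    adj = {v: set() for v in order}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    for a in order:
        earlier = [v for v in adj[a] if position[v] < position[a]]
        # Two non-adjacent neighbours that both precede a violate the condition.
        for b, c in combinations(earlier, 2):
            if c not in adj[b]:
                return False
    return True

def find_permissible_ordering(nodes, edges):
    """Return some permissible ordering, or None if none exists."""
    for order in permutations(nodes):
        if is_permissible(order, edges):
            return list(order)
    return None

path = [("A", "B"), ("A", "C")]
square = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
```

For the four-node cycle no ordering is permissible, which agrees with the presence of a pure cycle; adding one chord makes an ordering available.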

We shall now provide a polynomial algorithm to determine whether or not a given acceptable collection of constraints for ∈

2^(U) is admissible, based on 5 above. It is clear that RBFS forest and RBFS-ordering can play a central role in the following algorithm.

Algorithm 2. (Checking for Admissibility) Let 

 be an acceptable collection of constraints for ε 

 2^(U). 1. Construct 

 = | 

 _(ε)( 

 )|. 2. Let 

 be any total ordering on ε. Construct the RBFS-ordering 

 over  

 based on 

 using Algorithm 1. 3. For i = 1 to i = |ε|, do the following:  a)Determine 

 .  b) For every A, B ∈ 

 , determine whether or not {A, B} is in 

 . If {A, B} is not in 

 , stop. 

 is not admissible. Otherwise, continue. 4. Stop. 

 is admissible.

Section 13. Compatibility

FIG. 13 is a flow diagram of checking for compatibility, according to an embodiment of the present invention. Sufficient conditions for m and

to be compatible are given in this section. An optimal method for adjusting the values of a given measure to make it compatible with the constraints is also provided.

Let C be an acceptable collection of constraints for ∈

2^(U) and ℬ⊆∈. ℬ is well-nested under C if and only if for every A, Bϵℬ, r_(C)(A, B) is defined, i.e., any pair of sets in ℬ are either disjoint or one of the sets is a subset of the other. In addition, if ℬ is well-nested under C and T=(V, R) is a tree, then T is a well-nested tree of ℬ if and only if V=ℬ∪{U}, and for every A, BϵV, A is the parent of B in T if and only if A⊇B.

Let 𝒞 be an admissible collection of constraints for ℰ ⊆ 2^(U), 𝒢 = |G_(ℰ)(𝒞)|, ≺ a RBFS-ordering over 𝒢 based on some total ordering on ℰ, and 1 ≤ k ≤ |ℰ|. The following algorithm provides constructive definitions for

and

.

Algorithm 1. (Construction of 

 and 

 ) Let 

 be an admissible collection of constraints for ε 

 2^(U) and 

 a total ordering on ε. 1. Construct 

 = | 

 _(ε)( 

 )|. 2. Construct the RBFS-ordering 

 over 

 based on 

 using Algorithm 1. 3. Order the nodes in 

 according to 

 . Say E₁, E₂, ... E_(m). 4. For k = 1 to k = m, do the following:  (a) Determine 

 (E_(k)).  (b)  Construct a well-nested tree 

 for 

 (E_(k)).  (c) Let 

 be the parent of E_(k) in 

 .  (d) Let 

 be the collection consisting of all siblings of E_(k) in 

 .  (e) Let 

 be the collection consisting of all children of E_(k) in 

 . 5. Return the following sequences:  (a)  

 , ... , 

 .  (b)  

 , ... , 

 .  (c)  

 , ... , 

 .

Let 𝒞 be an admissible collection of constraints for ℰ ⊆ 2^(U), where ℰ is ordered by a RBFS-ordering ≺ over 𝒢 based on some total ordering on ℰ, m a measure over ℰ, and 1 ≤ k ≤ |ℰ|.

-   $m_{p,\mathcal{G},\prec}^{k} = m(\omega_{p,\mathcal{G},\prec}^{k})$
-   $m_{s,\mathcal{G},\prec}^{k} = \sum_{E \in \Omega_{s,\mathcal{G},\prec}^{k}} m(E)$
-   $m_{c,\mathcal{G},\prec}^{k} = \sum_{E \in \Omega_{c,\mathcal{G},\prec}^{k}} m(E)$

Let 𝒞 be an admissible collection of constraints for ℰ ⊆ 2^(U), and m a measure over ℰ. m is super additive over 𝒞 if and only if m(E) ≥ Σ_(E₀∈ℰ₀) m(E₀) for every E ∈ ℰ and ℰ₀ ⊆ ℰ, where all members of ℰ₀ are pairwise disjoint and E₀ ⊆ E for every E₀ ∈ ℰ₀.
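The super additivity condition can be checked by brute force on small examples. The following is a minimal sketch, assuming the sets are given explicitly as frozensets and the measure as a dictionary; the name `is_super_additive` is hypothetical, and the enumeration is exponential, so it is only meant to illustrate the definition.

```python
from itertools import combinations

def is_super_additive(measure, tol=1e-12):
    """measure: dict mapping frozenset -> value.

    Checks m(E) >= sum of m(E0) over every pairwise-disjoint family of
    proper subsets of E taken from the collection.
    """
    sets = list(measure)
    for e in sets:
        subsets = [s for s in sets if s < e]
        for r in range(1, len(subsets) + 1):
            for family in combinations(subsets, r):
                disjoint = all(a.isdisjoint(b) for a, b in combinations(family, 2))
                if disjoint and measure[e] < sum(measure[s] for s in family) - tol:
                    return False
    return True

A, B, C = frozenset({1, 2, 3, 4}), frozenset({1, 2}), frozenset({3})
print(is_super_additive({A: 0.5, B: 0.2, C: 0.2}))
print(is_super_additive({A: 0.5, B: 0.4, C: 0.2}))
```

In the second call, the disjoint subsets B and C together carry more weight than A, so the condition fails.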

As shown in the following theorem, there is more than one method to test for compatibility, e.g., super additivity or a permissible total ordering.

Let 𝒞 be an admissible collection of constraints for ℰ ⊆ 2^(U) and let m be a measure over ℰ. The following statements are equivalent:

-   -   1. m and 𝒞 are compatible.
    -   2. m is super additive over 𝒞.
    -   3. There exists a total ordering ≺ on ℰ which is permissible on 𝒢 = |G_(ℰ)(𝒞)|, such that for all 1 ≤ i ≤ m, m_(c,𝒢,≺)^(i) ≤ m(E_(i)) ≤ m_(p,𝒢,≺)^(i) − m_(s,𝒢,≺)^(i), where ℰ = {E₁, E₂, . . . , E_(m)} is ordered according to ≺.
    -   4. For every total ordering ≺ on ℰ which is permissible on 𝒢 = |G_(ℰ)(𝒞)|, and for all 1 ≤ i ≤ m, m_(c,𝒢,≺)^(i) ≤ m(E_(i)) ≤ m_(p,𝒢,≺)^(i) − m_(s,𝒢,≺)^(i), where ℰ = {E₁, E₂, . . . , E_(m)} is ordered according to ≺.

Criteria 2, 3 or 4 above, together with Algorithm 1, can be used to test and ensure that m and 𝒞 are compatible.

Let 𝒞 be a collection of constraints for ℰ ⊆ 2^(U), and let ℰ₀ ⊆ ℰ, E₀ ∈ ℰ ∪ {U}. ℰ₀ is a disjoint group of E₀ under 𝒞 if and only if E ⊆ E₀ for every E ∈ ℰ₀, and all the members of ℰ₀ are pairwise disjoint under 𝒞. Moreover, ℰ₀ is a maximal disjoint group of E₀ under 𝒞 if and only if ℰ₀ is a disjoint group of E₀ under 𝒞, and for every ℰ₁ ⊆ ℰ where ℰ₀ ⊂ ℰ₁, ℰ₁ is NOT a disjoint group of E₀ under 𝒞.

FIG. 14 is a flow diagram of guaranteeing compatibility, according to an embodiment of the present invention.

Problem 1. (Minimization Problem for Compatibility) Let 𝒞 be a collection of constraints for ℰ ⊆ 2^(U), and m a measure over ℰ.

-   -   1. For each E ∈ ℰ ∪ {U}, let x_(E) be a real variable where 0 ≤ x_(E) ≤ m(E).
    -   2. For each ℰ₀ ⊆ ℰ and E₀ ∈ ℰ where ℰ₀ is a maximal disjoint group of E₀ under 𝒞, introduce the linear constraint Σ_(E∈ℰ₀) x_(E) ≤ x_(E₀).
    -   3. Minimize the objective function Σ_(E∈ℰ) (m(E) − x_(E))².

Problem 1 is a quadratic programming problem (Alexander Schrijver. Theory of Linear and Integer Programming. John Wiley, 1998) and therefore can be solved using standard techniques.
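Because Problem 1 asks for the feasible point closest to m in the least square sense, one standard technique is Dykstra's alternating projection method. The following is a minimal sketch using that method, not a technique named in the specification; it assumes each maximal disjoint group is supplied as a (parent, members) pair, and all function names are hypothetical.

```python
def project_halfspace(x, a, b):
    """Project x onto the half-space {y : a.y <= b}."""
    violation = sum(ai * xi for ai, xi in zip(a, x)) - b
    if violation <= 0:
        return list(x)
    norm2 = sum(ai * ai for ai in a)
    return [xi - violation * ai / norm2 for ai, xi in zip(a, x)]

def project_box(x, upper):
    """Project x onto the box 0 <= x_E <= m(E)."""
    return [min(max(xi, 0.0), ui) for xi, ui in zip(x, upper)]

def induced_measure(m, groups, iters=500):
    """m: dict name -> value; groups: list of (parent, [members]),
    each encoding the constraint sum(x_member) <= x_parent."""
    names = list(m)
    index = {n: i for i, n in enumerate(names)}
    target = [m[n] for n in names]
    projections = []
    for parent, members in groups:
        a = [0.0] * len(names)
        a[index[parent]] -= 1.0
        for n in members:
            a[index[n]] += 1.0
        projections.append(lambda x, a=a: project_halfspace(x, a, 0.0))
    projections.append(lambda x: project_box(x, target))
    x = list(target)
    increments = [[0.0] * len(names) for _ in projections]
    for _ in range(iters):
        for i, proj in enumerate(projections):
            y = [xi + pi for xi, pi in zip(x, increments[i])]
            x = proj(y)
            increments[i] = [yi - xi for yi, xi in zip(y, x)]
    return dict(zip(names, x))

result = induced_measure({"A": 0.5, "B": 0.4, "C": 0.3}, [("A", ["B", "C"])])
print(result)
```

In this example B and C are disjoint subsets of A but their values sum to more than m(A); the adjusted values approach A = 0.5, B = 0.3 and C = 0.2, which never exceed the original measure.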

Let 𝒞 be a collection of constraints for ℰ ⊆ 2^(U), and m a measure over ℰ. Let x_(E) be a solution of Problem 1, and let m₀(E) = x_(E) for every E ∈ ℰ. Then m₀ is the measure over ℰ induced by m.

The above definition can be extended to partial measures. A partial measure m over ℰ is a measure over some subset of ℰ. In this case, we extend m such that m(E) = 1 for all E ∈ ℰ which are not in the domain of m.

There are many other ways to guarantee that 𝒞 and m are compatible, besides solving Problem 1. Nevertheless, the measure m₀ above is the measure closest to m, in the least square sense, without exceeding it. It should be noted that adjustment of the values of any measure might need to be done only sparingly to ensure the integrity of 𝒞 and m.

Section 14. Preprocessing and Extension

FIG. 15 is a flow diagram of preprocessing, according to an embodiment of the present invention. In order to make the computation of m̃(E), where E ∈ ℰ̃, more efficient, an algorithm for preprocessing ℰ, 𝒞 and m is given. The preprocessing algorithm also determines whether or not 𝒞 is admissible, and whether or not m and 𝒞 are compatible.

Algorithm 1. (Preprocess ε, 

 and m) Let 

 be an acceptable collection of constraints for ε 

 2^(U), 

 a total ordering on ε, and m a measure over ε. 1. Construct 

 = | 

 _(ε)( 

 )|. 2. Construct the RBFS-ordering 

 over 

 based on 

 using Algorithm 1. 3. Order the nodes in 

 according to 

 . Say E₁, E₂, ... E_(m). 4. For k = 1 to k = m, do the following:  a)Determine 

 (E_(k)).  b) For every A, B ∈ 

 (E_(k)), determine whether or not {A, B} is in 

 . If {A, B} is not in 

 , stop. 

 is not admissible. Otherwise, continue.  c) Construct a well-nestedtree 

 for 

 (E_(k)).  d) Determine 

 , and 

 . (see Algorithm 1)  e) Test whether or not the conditions forcompatibility are satisfied. If any of the conditions are violated,stop. 

 is not compatible with m. Otherwise, continue.  f) Let 

 (E_(k)) = [m(E_(k)) − 

 ] ÷ [ 

 − 

 − 

 ]. 5. Return 

 .

The preprocessing of ℰ, 𝒞 and m combines all the constraints in 𝒞 and produces a new function m_(𝒢,≺). Clearly, m_(𝒢,≺) is a measure over ℰ.

Let 𝒞 be an admissible collection of constraints for ℰ = {E₁, E₂, . . . , E_(m)} ⊆ 2^(U), where ℰ is ordered by some total ordering ≺ on ℰ which is permissible on |G_(ℰ)(𝒞)|. If E ∈ ℰ, then define m_(𝒢,≺)(E′) = 1 − m_(𝒢,≺)(E).

Let

2^(U).

=B∪{B′|BϵB}. Moreover, let ∈

2^(U). B is coset of ∈ if and only if B⊆∈ and for every Eϵ∈, not both Eand E′ are in B.

Let

be an admissible collection of constraints for ∈={E₁, E₂, .. . , E_(m)}

2^(U), where ∈ is ordered by some total ordering

on ∈ which is permissible on |G_(∈)(

)|, and B a coset of ∈. The

-cover

(B) of B under

is the smallest coset of ∈ satisfying the following conditions:

-   -   1. B⊆        (B);    -   2. If B=E_(k) and there exists B₀ ϵ        (B) where B is the parent of B₀ with respect to the well-nested        tree        for        (E_(k)), then B ϵ→        (B); and    -   3. If B=E_(k) and there exists B₀ ϵ        (B) where B is a sibling of B₀ with respect to the well-nested        tree        for        (E_(k)), then B′ ϵ        (B).

Let 𝒞 be an admissible collection of constraints for ℰ = {E₁, E₂, . . . , E_(m)} ⊆ 2^(U), where ℰ is ordered by some total ordering ≺ on ℰ which is permissible on |G_(ℰ)(𝒞)|, m a measure over ℰ which is compatible with 𝒞, and ℬ a coset of ℰ. Let I(ℬ) be the intersection of all the sets in ℬ. If I(ℬ) ≠ Ø, then for every permissible total ordering ≺ on ℰ:

$\tilde{m}(I(\mathcal{B})) = \tilde{m}\Big(\bigcap_{B \in \Lambda_{\mathcal{G},\prec}(\mathcal{B})} B\Big) = \prod_{B \in \Lambda_{\mathcal{G},\prec}(\mathcal{B})} m_{\mathcal{G},\prec}(B).$

In particular, if 𝒞 = Ø, then

$\tilde{m}(I(\mathcal{B})) = \tilde{m}\Big(\bigcap_{B \in \mathcal{B}} B\Big) = \prod_{B \in \mathcal{B}} m_{\mathcal{G},\prec}(B).$

The above result provides a method for determining m̃(I(ℬ)). Clearly, under 𝒞, the relevant sets in ℰ are basically stochastically independent. This is the reason why we named our method constraint stochastic independence method. Furthermore, m̃(I(ℬ)) can be determined in O(|ℬ|²)-time after preprocessing.

In general, if G ∈ ℰ̃, we can express G as a union of intersections of sets, such as in disjunctive normal form. If all the terms in the expression are pairwise disjoint, m̃(G) can be determined by applying the above result to each term and then summing the results.
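The product-and-sum computation can be sketched for the simplest case. The example assumes there are no constraints and, as an illustrative assumption, that the preprocessed value of each set then equals its measure, with a complement E′ valued at 1 minus the value of E. The names `term_value` and `dnf_value` are hypothetical.

```python
from math import prod

def term_value(term, m):
    """term: iterable of (set_name, positive) pairs describing an intersection.

    A pair (name, False) stands for the complement of the named set.
    The value of the intersection is the product of the individual values.
    """
    return prod(m[name] if positive else 1.0 - m[name] for name, positive in term)

def dnf_value(terms, m):
    """Sum the values of pairwise-disjoint intersection terms."""
    return sum(term_value(t, m) for t in terms)

m = {"E1": 0.5, "E2": 0.4}
# E1 ∪ E2 written as three pairwise-disjoint terms.
union_terms = [
    [("E1", True), ("E2", True)],
    [("E1", True), ("E2", False)],
    [("E1", False), ("E2", True)],
]
print(dnf_value(union_terms, m))
```

Writing the union as disjoint terms is what allows the per-term products to be added without double counting.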

Section 15. Unification Algorithms

In this section, we provide several algorithms for determining σ_(κ)(F).

Let

be a collection of propositions.

={P′|Pϵ

} and

=

∪

.

is the collection of all

P where Q⊆

. For completeness,

P=F if

=Ø.

is the smallest collection of propositions containing

and close under negation, conjunction and disjunction.

Let

be a collection of atomic propositions.

is a coset of

if and only if

⊆

and for every Pϵ

, not both P and P′ are in

. Let Lϵ

. L is d-simple if and only if L=

P for some coset

of

.

Without loss of generality, we shall assume that L is d-simple wheneverLϵ

.

An AKB κ is disjunctive if and only if there exists a finite collection

of atomic propositions such that Aϵ

for every (E→A) ∈ κ.

We shall assume that

is the smallest collection of atomic propositions that satisfies theabove definition.

The following shows how to transform an arbitrary AKB into a disjunctive AKB.

Algorithm 1. (Disjunctive AKB) Given an AKB κ. 1. For each ω ∈ κ, do thefollowing:  (a) Express r(ω) as L₁ ∧ L₂ ∧ ··· ∧ L_(k), where L_(i) ∈ 

 and L_(i) is simple for each i = 1, 2, ..., k.  (b) For each i = 1, 2,..., k, add to κ the object l(ω) → L_(i).  (c) Remove ω from κ. 2.Return κ.
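The splitting step of the disjunctive transformation can be sketched as follows. The example assumes each rule r(ω) has already been expressed as a conjunction of clauses, and it represents an object as a (body, clauses) pair; this representation and the name `to_disjunctive` are hypothetical.

```python
def to_disjunctive(kb):
    """kb: iterable of (body, clauses), where body is a frozenset of evidence
    and clauses is a list of frozensets of literal names whose conjunction
    is the rule. Each object is replaced by one object per clause, all
    sharing the original body l(ω)."""
    result = []
    for body, clauses in kb:
        for clause in clauses:
            result.append((body, clause))
    return result

kb = [(frozenset({1, 2}), [frozenset({"P"}), frozenset({"Q", "R'"})])]
for body, clause in to_disjunctive(kb):
    print(sorted(body), sorted(clause))
```

Converting an arbitrary rule into a conjunction of clauses is outside the scope of this sketch.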

Let κ be an AKB. κ is irreducible if and only if κ satisfies the following conditions:

1. For every ω ∈ κ, l(ω) ≠ Ø and r(ω) ≠ T.

2. κ is disjunctive.

3. For every ω₁, ω₂ ∈ κ, if r(ω₁) = r(ω₂), then l(ω₁) = l(ω₂).

4. For every ω₁, ω₂ ∈ κ, if r(ω₁) ⇒ r(ω₂), then l(ω₁) ⊆ l(ω₂).

The following shows how to transform an arbitrary AKB into an irreducible AKB.

Algorithm 2. (Irreducible AKB) Given an AKB κ.
1. Remove all ω from κ where l(ω) = Ø or r(ω) = T.
2. Transform κ into an equivalent disjunctive AKB using Algorithm 1.
3. Let κ = {ω_(κ)^(r) | ω ∈ κ}, where ω_(κ)^(r) = ∨_(μ∈κ, r(μ)=r(ω)) μ.
4. For every ω ∈ κ, let κ_(ω) = {μ ∈ κ | ρ(r(μ)) ⊆ ρ(r(ω))}.
5. For every ω ∈ κ such that |κ_(ω)| > 1, replace ω with ∨_(μ∈κ_(ω)) μ.
6. Return κ.

Let κ be an irreducible AKB.

Ω _(k)={E→A|E⊆U, Aϵ

}, Ω̆={E→A|E⊆U,A,ϵ

} and {tilde over (Ω)}×_(k)={E→A|E⊆U,Aϵ

}.

Let Ø=Ω⊆{tilde over (Ω)}_(k). Then ∧_(wϵΩ)ω=(U→T) and ∨_(wϵΩ)ω=(Ø→F).

Let ωϵ{tilde over (Ω)}_(k). p107 ) is the collection of all P ϵ

that occur in τ(ω), and |ω|=|o(ω)|. Let

⊆

ω/

is obtained from ω by removing from τ(ω) all Pϵ

. If p(ω) ⊆

, then ω/

=(l(ω)→F). Moreover, if

={P}, then ω/P=6107 /

.

Let λ⊆κ and

⊆

.

-   -   λ        ={μκλ|        ⊆p(μ)}, λ        ={μϵλ|        ∩p(μ)=Ø}, and λ/        ={μ/        |μϵα}.    -   If        ={P}, then λ_(p)=λ_(p), λ _(p)=λ        and λ/P=λ/        .

Let κ be an irreducible AKB and Pϵ

P is a useless symbol if and only if κ_(p)=Ø or κ_(p′)=Ø. Moreover, κ isuseful if and only if P is not a useless symbol for all Pϵ

_(κ).

The following shows how to transform an irreducible AKB into a useful AKB.

Algorithm 3. (Useful AKB) Given a irreducible AKB κ. 1. While thereexists a useless symbol P ∈ 

 let κ = κ _({P, P′}). 2. Return κ.
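The removal of useless symbols can be sketched as follows. The example assumes an AKB stored as a dictionary from clause to body of evidence, with each literal written as an (atom, polarity) pair, and it reads the update step as dropping every object that mentions P or P′. The name `remove_useless` is hypothetical.

```python
def remove_useless(kb):
    """kb: dict mapping clause (frozenset of (atom, positive) literals)
    to body (frozenset of evidence).

    An atom is treated as useless when it occurs with only one polarity.
    Objects mentioning a useless atom are dropped, and the check is
    repeated because removals can make further atoms useless.
    """
    kb = dict(kb)
    while True:
        literals = {lit for clause in kb for lit in clause}
        useless = {atom for (atom, pos) in literals if (atom, not pos) not in literals}
        if not useless:
            return kb
        kb = {c: e for c, e in kb.items() if not any(a in useless for a, _ in c)}

P, NOT_P = ("P", True), ("P", False)
Q, R = ("Q", True), ("R", True)
balanced = {frozenset({P}): frozenset({1}), frozenset({NOT_P}): frozenset({2})}
cascading = {
    frozenset({P, Q}): frozenset({1}),
    frozenset({NOT_P}): frozenset({2}),
    frozenset({NOT_P, R}): frozenset({3}),
}
print(remove_useless(balanced) == balanced)
print(remove_useless(cascading))
```

In the second knowledge base, Q and R occur only positively, and once their objects are removed P occurs only negatively, so every object is eventually dropped.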

Let κ be an irreducible AKB, ω₁, ω₂ ϵΩ̆_(k) and Pϵ

_(k). P is complemented with respect to ω₁ and ω₂ if and only if eitherPϵp(ω₁) and P′ϵp(ω₂), or P′ ϵp(ω₁) and P ϵp(ω₂).

Let κ be an irreducible AKB and ω_(l), ω₂ ϵΩ̆_(k). c(ω₁, ω₂) is thecollection of all Pϵ

_(κ)where P is complemented with respect to ω₁ and ω₂.

Let P ϵ

.

-   -   ω₁ ~^(P) ω₂ if and only if P ∈ ρ(ω₁), P′ ∈ ρ(ω₂) and c(ω₁/P, ω₂/P′) = Ø.
    -   ω₁ ~ ω₂ if and only if ω₁ ~^(P) ω₂ for some such P. In other words, ω₁ ~ ω₂ if and only if |c(ω₁, ω₂)| = 1.
    -   If ω₁ ~^(P) ω₂, then ω₁ ⋄ ω₂ = (l(ω₁) ∩ l(ω₂) → r(ω₁/P) ∨ r(ω₂/P′)).

Otherwise, ω₁ ⋄ ω₂ is not defined. In other words, ω₁ ⋄ ω₂ merges ω₁ and ω₂ and removes both P and P′. In this case, we say that ω₁ and ω₂ are mergeable (300).

-   -   Γ(κ, ω) = (κ − {μ ∈ κ | r(μ) = r(ω)}) ∪ {∨_(μ∈κ, r(μ)=r(ω)) μ}.

FIG. 4 is a flow diagram of an example unification algorithm, according to an aspect of an embodiment of the invention. FIG. 4 also emphasizes a new unification operator for reasoning under uncertainties and/or incompleteness capable of managing the rules and the sets of evidences together.

Algorithm 4. (Unification Algorithm) Given an irreducible AKB κ.
1. Do while there exists an unmarked pair (ω₁, ω₂), where ω₁, ω₂ ∈ κ and ω₁ ~ ω₂:
 (a) Let κ = Γ(κ, ω₁ ⋄ ω₂).
 (b) Mark (ω₁, ω₂).
2. If κ contains objects of the form (G → F), return the union of all G ⊆ U such that (G → F) ∈ κ. Otherwise, return Ø.
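The unification loop can be sketched in a few lines under stated assumptions: an AKB is stored as a dictionary from clause to body of evidence, literals are (atom, polarity) pairs, F is the empty clause, and Γ is read as taking the union of the bodies of objects that share the same rule. Merged objects with an empty body are skipped. All names are hypothetical.

```python
def complemented(c1, c2):
    """Atoms that occur with opposite polarity in the two clauses."""
    return {atom for (atom, pos) in c1 if (atom, not pos) in c2}

def merge(c1, e1, c2, e2):
    """Return (clause, body) for the merge of two objects, or None when
    the objects are not mergeable (exactly one complemented atom is required)."""
    atoms = complemented(c1, c2)
    if len(atoms) != 1:
        return None
    (atom,) = atoms
    clause = frozenset(lit for lit in c1 | c2 if lit[0] != atom)
    return clause, e1 & e2

def unify(kb):
    """Return the union of all bodies G with (G -> F) derivable from kb."""
    kb = dict(kb)
    marked = set()
    changed = True
    while changed:
        changed = False
        items = list(kb.items())
        for c1, e1 in items:
            for c2, e2 in items:
                pair = frozenset([(c1, e1), (c2, e2)])
                if pair in marked:
                    continue
                marked.add(pair)
                merged = merge(c1, e1, c2, e2)
                if merged is None or not merged[1]:
                    continue
                clause, body = merged
                new_body = kb.get(clause, frozenset()) | body
                if new_body != kb.get(clause):
                    kb[clause] = new_body  # same rule: bodies are united
                    changed = True
    return kb.get(frozenset(), frozenset())

kb = {
    frozenset({("P", True), ("Q", True)}): frozenset({1, 2}),
    frozenset({("P", False)}): frozenset({2, 3}),
    frozenset({("Q", False)}): frozenset({2, 4}),
}
print(sorted(unify(kb)))
```

The order in which pairs are visited is arbitrary here, reflecting that the algorithm leaves the ordering open.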

Let κ be an irreducible AKB and P ϵ

Since σ_(κ)(F)=σ_(κ)(P) ∩σ_(κ)(P′), therefore σ_(κ)(F) can be determinedrecursively on

Although this requires the computations of both σ_(κ)(τ(ω)) andσ_(κ)((τ(ω)′)), the two can be carried out in an integrated manner.Moreover, it can be used to reduce the size of κ.

The following show how to construct κ|P given an irreducible AKB κ, andP ϵ

.

Algorithm 5. (Construction of κ|P) Given a irreducible AKB κ, and P ∈ 

 . Return κ|P 1. Do while there exists an unmarked pair (ω_(1,) ω₂),where ω₁ ∈ κ_(P), ω₂ ∈ κ_(P′), and ω₁ ~^(P) ω₂  (a) Let κ = ┌(κ, ω₁ ⋄ω₂).  (b) Mark (ω_(1,) ω₂). 2. Return κ ^({P, P′}) .

Algorithm 5 can be used as a basis for a unification algorithm as shown below:

Algorithm 6. (Unification Algorithm based on κ|P) Given a irreducibleAKB κ. Return G ⊆ U. 1. While there exists P ∈ 

 , apply Algorithm 5 with input κ and P, and let κ = κ|P. 2. Let 

 = {G ⊆ U|(G → F) ∈ κ}. If 

 ≠ Ø then return 

 G. Otherwise return Ø.
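The symbol-by-symbol variant can be sketched as follows, with the same illustrative representation as before: a dictionary from clause to body, literals as (atom, polarity) pairs, and F as the empty clause. The construction of κ|P is read here as merging every admissible pair across P and P′ and then keeping only objects that mention neither. The names `eliminate` and `unify_by_elimination` are hypothetical.

```python
def eliminate(kb, atom):
    """Sketch of κ|P: merge objects containing P with objects containing P',
    then keep only objects that mention neither."""
    pos_lit, neg_lit = (atom, True), (atom, False)
    pos = [(c, e) for c, e in kb.items() if pos_lit in c]
    neg = [(c, e) for c, e in kb.items() if neg_lit in c]
    out = {c: e for c, e in kb.items() if pos_lit not in c and neg_lit not in c}
    for c1, e1 in pos:
        for c2, e2 in neg:
            r1, r2 = c1 - {pos_lit}, c2 - {neg_lit}
            # The remainders must share no complemented atom.
            if any((a, not s) in r2 for a, s in r1):
                continue
            body = e1 & e2
            if body:
                clause = r1 | r2
                out[clause] = out.get(clause, frozenset()) | body
    return out

def unify_by_elimination(kb):
    """Eliminate every atom in turn, then read off the body attached to F."""
    kb = dict(kb)
    atoms = sorted({atom for clause in kb for (atom, _) in clause})
    for atom in atoms:
        kb = eliminate(kb, atom)
    return kb.get(frozenset(), frozenset())

kb = {
    frozenset({("P", True), ("Q", True)}): frozenset({1, 2}),
    frozenset({("P", False)}): frozenset({2, 3}),
    frozenset({("Q", False)}): frozenset({2, 4}),
}
print(sorted(unify_by_elimination(kb)))
```

Atoms are eliminated in sorted order only for determinism; any order of elimination could be substituted.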

By virtue of the above results, ω₁⋄ω₂, where ω₁~ω₂, can be used as the basic operation in any unification algorithm for determining σ_(κ)(F). (See FIG. 4) The operator ⋄ is capable of performing deduction under uncertainties and/or incompleteness and is therefore a significant and natural generalization of the classical deduction process. Step 1 of Algorithms 4 and 6 is intentionally left vague so that any ordering of (ω₁, ω₂), where ω₁, ω₂ ∈ κ and ω₁~ω₂, and any ordering of

, can be used. As a matter of fact, many heuristics can be applied to various unification algorithms, e.g., Algorithms 4 and 6, to make them more efficient.

If necessary, the algorithms below can be used to reduce the size of κ, and therefore can be used to devise approximate algorithms to determine σ_(κ)(L).

Algorithm 7. (Processing Rules with length ≤ q) Given an irreducible AKB κ and a non-negative integer q. Return κ^((q)).
1. Apply Algorithm 3 to transform κ into an equivalent useful AKB.
2. While there exists an unmarked pair (ω₀, ω₁) where ω₀, ω₁ ∈ κ such that |ω₀| ≤ q and ω₀ ~ ω₁, do the following:
 (a) Let κ = Γ(κ, ω₀ ⋄ ω₁).
 (b) Mark the pair (ω₀, ω₁).
3. Return κ.

Let κ be an irreducible AKB, ω₁, ω₂ ∈ κ, and ω₁ ~ ω₂. d(ω₁, ω₂) = |ω₁ ⋄ ω₂| − max(|ω₁|, |ω₂|).

Algorithm 8. (Processing Rules with d(ω₁, ω₂) ≤ q) Given an irreducible AKB κ and integer q ≥ −1. Return κ^([q]).
1. Apply Algorithm 3 to transform κ into an equivalent useful AKB.
2. While there exists an unmarked pair (ω₁, ω₂) where ω₁, ω₂ ∈ κ such that d(ω₁, ω₂) ≤ q, do the following:
 (a) If ω₁ ~ ω₂, replace κ by Γ(κ, ω₁ ⋄ ω₂).
 (b) Mark the pair (ω₁, ω₂).
3. Return κ.

The embodiments of the present invention can provide one or more of the following:

A computer-based method for constructing, reasoning, analyzing and applying knowledge bases with uncertainties and/or incompleteness. The system/model will be referred to as augmented knowledge base (AKB): (See Section 1)

-   -   (a) The objects in an AKB are of the form E→A, where A is a        logical sentence in a first-order logic, a proposition or a rule        in the traditional sense, and E is a set corresponding to the        body of evidences that supports A.    -   (b) No restrictions are imposed on E→A. For example, it is        possible to have both E₁→A and E₂→A in the AKB κ. This means two        or more sets of evidences can be used to support the same rule,        thereby allowing, among other things, multiple experts/sources        to be involved in designing a single AKB.    -   (c) Relationships among the bodies of evidences can be specified        explicitly. In other words, constraints may be imposed on the        bodies of evidences.    -   (d) A mapping associates each set (body of evidences), not each        rule, with a value.    -   (e) In an AKB, evidences (the E's) and rules (the A's) represent        completely different objects and each play an essential but        separate role in the operations of an AKB.

FIG. 3 provides the basic mechanism for deductive reasoning on an AKB, according to an embodiment of the present invention. The basic mechanism associated with deductive reasoning on an AKB κ is

$G \overset{d}{\rightarrow}_{\kappa} L,$

where G is a body of evidences and L is a target rule. The support σ_(κ)(L) of L with respect to κ, and the plausibility σ̄_(κ)(L) of L with respect to κ, can both be determined using

$G \overset{d}{\rightarrow}_{\kappa} L.$

(See Section 1)

Computation of σ_(κ)(L) and σ̄_(κ)(L), the support and plausibility of L, where L is a target rule, using σ_(κ)(L) = σ_(κ₀)(F) where κ₀ = κ ∪ {U→L′}; and σ̄_(κ)(L) = σ_(κ₁)(F) where κ₁ = κ ∪ {U→L}. (See Section 1). This provides a universal method for computing the support and plausibility of L, for any L.

A new unification operator ω₁⋄ω₂ which is applicable to knowledge bases with uncertainties and/or incompleteness. (See Unification Algorithms Section). This new unification operator extends and generalizes the classical unification operators. The classical unification operators are not capable of dealing with uncertainties and/or incompleteness. (See also FIG. 4)

Several unification methods/algorithms for determining σ_(κ)(F). (See Unification Algorithms Section). FIG. 4 is a flow diagram of an example unification algorithm, according to an aspect of an embodiment of the invention.

Functions or mappings representing the strength or validity of the body of evidences associated with a given rule, as well as associated with all relevant rules. They are referred to as measures. (See Section 2)

AKBs can use the extension scheme constraint stochastic independence method given above and/or any other extension schemes, like those given in Section 2. Moreover, an AKB can select more than one scheme at a time, and apply them either simultaneously and/or conditionally.

FIG. 6 is a flow diagram of deductive inference in AKB, according to an embodiment of the present invention. The method for performing deductive inference in an AKB is done in two separate steps, given AKB κ (350) and target rule L (355). (See Sections 2 and 4).

-   -   (1) Form the body of evidences that supports L. This can be        accomplished by applying any of the Unification Algorithms given        (360).    -   (2) Determine the value associated with the body of evidences        obtained in the first step.

This depends on the extension scheme(s) selected for κ (362 and 364).

AKB encompasses most of the well-known formalisms of knowledge bases together with their associated reasoning schemes, including probabilistic logic (N. J. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71-87, 1986), Dempster-Shafer theory (G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976), Bayesian network (Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988), Bayesian knowledge base (Eugene Santos, Jr. and Eugene S. Santos. A framework for building knowledge-bases under uncertainty. Journal of Experimental and Theoretical Artificial Intelligence, 11:265-286, 1999), fuzzy logic (L. A. Zadeh. The role of fuzzy logic in the management of uncertainty in expert systems. Fuzzy Sets and Systems, 11:199-227, 1983; L. A. Zadeh. Fuzzy sets. Information and Control, 8:338-353, 1965; R. Yager. Using approximate reasoning to represent default knowledge. Artificial Intelligence, 31:99-112, 1987), numerical ATMS (Johan DeKleer. An assumption-based TMS. Artificial Intelligence, 28:163-196, 1986), incidence calculus (Alan Bundy. Incidence calculus: A mechanism for probabilistic reasoning. Journal of Automated Reasoning, 1(3):263-283, 1985; Weiru Liu and Alan Bundy. Constructing probabilistic ATMSs using extended incidence calculus. International Journal of Approximate Reasoning, 15(2):145-182, 1996), etc. We illustrated how to recast the first four as AKBs (see Section 3), so that they can be expanded, and results obtained for AKBs can be applied directly to them.

Present several important properties of AKBs and provide methods for checking, applying and/or managing them. The properties include: consistency, completeness, perfectness, contribution, i-consistency, i-completeness, i-perfectness, monotonicity, vulnerability, possible deception, etc. (see Sections 5 and 6). (Many of these properties, such as contribution, vulnerability, etc., are applicable to any complex systems.)

Methods to enable AKBs to perform inductive inferences and extract meaningful new knowledge from AKBs as long as certain consistency conditions are satisfied (see Section 6). One such method of performing inductive inference in an AKB is as follows: Given AKB κ and L ∈

.

-   -   (a) Construct κ and L′.    -   (b) Perform deductive inference on κ and L′.

Methods to construct consistent higher order AKBs, including

_(κ) ⁰,

_(κ) ¹, and different varieties of

_(κ) ², to serve as extensions of any inconsistent AKB (see Section 7).

Present a new system/model named free-form database (FFDB) and provide methods for transforming relational databases and deductive databases into FFDBs, i.e., augmented relational database and augmented deductive database, which are special cases of AKBs (see Section 8). This provides a new way of dealing with relational databases and deductive databases, as well as expanding their capabilities.

Using FFDB, a new system/model named augmented inductive database is presented which can be used to extract new information from relational databases (see Section 8). This provides a general, as well as integrated, method for performing data mining.

A method for extending any probabilistic measure via the new notion of ∈

2^(U) and by taking the average of all allowable sets. This method is referred to as constraint stochastic independence (CSI) method (see Section 9). This method has a clear probabilistic semantics, and is therefore not subject to the anomalies that are usually associated with other extension schemes.

Provide a new computational scheme for reasoning under uncertainties and/or incompleteness based on constraint stochastic independence method (see Section 10).

Present allowable constraints and provide methods to transform standard relations among sets into allowable constraints (see Section 10).

Provide a graph representation of weakly acceptable collection of constraints (see Section 11).

Methods for checking and ensuring that the collection of constraints is admissible (see Section 12).

Methods for checking and ensuring that the constraints are compatible with the measure in the extension scheme based on CSI (see Section 13).

Optimal method to adjust the values of the measure to guarantee compatibility (see Section 13).

Provide a preprocessing method/algorithm so that the extension based on CSI can be computed more efficiently (see Section 14).

A method of representing/processing/extracting knowledge into/over/from a knowledge-based system is provided as follows:

-   -   (i) A knowledge base containing objects where each object is a rule or logical sentence (knowledge) associated with a set of evidences that supports the knowledge. Existing knowledge bases that involve uncertainties and/or incompleteness contain mostly rules/knowledge, and a number representing the strength of the rule. Although evidences may be used in determining the number, these evidences are not explicitly represented in the knowledge base nor are they used in any way in the inference scheme.
    -   (ii) A set of relationships among the bodies of evidences. Since existing knowledge bases use numbers directly, rather than bodies of evidences, they are incapable of dealing with the relationships among the bodies of evidences. This is a very significant factor in making the knowledge base, presented in embodiments of this invention, more powerful, robust, accurate and/or capable of more refined characterizations.
    -   (iii) A mapping which associates each body of evidences with a value. In addition, each piece of knowledge may be associated with multiple bodies of evidences. Moreover, same bodies of evidences and/or some combinations of these bodies of evidences may be associated with any knowledge. This allows the knowledge base to be created by multiple experts and/or from multiple sources.
    -   (iv) Methods for performing deductive and inductive inferences. Existing knowledge bases can only perform deductive inferences. However, since inductive inference, in a knowledge base according to embodiments of this invention, can be viewed as the dual of deductive inference (i.e., instead of the original knowledge base, the negation of the knowledge and the complement of the associated bodies of evidences are used), inductive inferences can also be performed using our knowledge base. Inductive inference can be used to extract new knowledge from the original knowledge base; in other words, it is capable of doing data mining, etc.

According to an embodiment of the present invention, the described functions are implemented as a computer online service providing a database of knowledge according to embodiments of the present invention for representation of knowledge, knowledge acquisition mechanisms and inference mechanisms to output results or reports. According to an embodiment, a target knowledge is expressly associated with a set of evidences that supports the target knowledge. The set of evidences can be dynamically maintained, for example, updated. The relationship of evidences with the target knowledge can be based upon input upon acquisition of the target knowledge. According to an embodiment, relationships among evidences (constraints) in a set of evidences are maintained. Target knowledge can be retrieved from data sources and/or created/generated. In addition, the evidences, including their relationships, in support of the target knowledge can be retrieved from data sources and/or created/generated. A computer implemented user interface can be provided to support input of target knowledge, its associated evidences, and relationships among the evidences. A semantic data source based upon a domain ontology including value of evidences in support of the knowledge can be utilized.

For example, an expert (person and/or machine) can create knowledge and/or evidences, including their relationships, in support of the knowledge. According to an embodiment, a blog, social media chat or information may be a source of knowledge and/or evidences, including their relationships.

According to an embodiment, the inference mechanism (reasoning) is based upon a value indicative of strength of the evidence. Deductive inference involves determining a set of evidences and its validity. Inductive inference involves determining the set of evidences that can induce new knowledge or extract (data mining) new knowledge, by utilizing the constraints of the evidences in the set of evidences supporting the target knowledge. According to the embodiments described, methods are provided for checking and ensuring that the constraints are admissible and/or compatible, as described for example in paragraphs 233, 244 and FIGS. 12-13, an optimal method is provided to adjust the values of the measure to guarantee compatibility in case it is not compatible, and a preprocessing method/algorithm is provided so that the extension can be computed more efficiently.

Evidence strength indications may be derived and/or input, for example, based upon evaluations or ratings. According to an embodiment, the evidence values are derived based upon probabilities.

FIG. 16 is a functional block diagram of a processing device, such as a computer (hardware computing/processing machine) for the embodiments of the invention, namely a computer 120 configured to execute functions of the augmented knowledge base computer system 100. In FIG. 16, the computer can be any computing device that can execute instructions to provide the described functions. Typically, the computer includes an input device 1514 (for example, a mouse, keyboard, multi-touch display screen, etc.) and an output device 1502 (for example, a display to display a user interface or output information, printer, etc.). One or more computer controller(s) or processing cores 1504 (e.g., a hardware central processing unit) execute instructions (e.g., a computer program or software) that control the apparatus to perform operations. According to an aspect of an embodiment, one or more networked computer servers, each with a number of processing cores, execute the described operations.

Typically, a memory component 1506 stores the instructions for execution by the controller 1504. According to an aspect of an embodiment, the apparatus reads/writes/processes data of any computer readable recording or storage media 1510 and/or communication transmission media interface 1512. The communication transmission media interface provides a data network connection with one or more other machines (e.g., computers, a distributed network) to execute the described functions. The embodiments can be implemented via grid computing. The display 1502, the CPU 1504 (e.g., hardware logic circuitry based computer processor that processes instructions, namely software), the memory 1506, the computer readable media 1510, and the communication transmission media interface 1512 are in communication by one or more data bus(es) 1508.

According to an aspect of the embodiments of the invention, anycombinations of one or more of the described features, functions,operations, and/or benefits can be provided. A combination can be one ora plurality. The embodiments can be implemented as an apparatus (amachine) that includes hardware for performing the described features,functions, operations, and/or benefits, for example, hardware to executeinstructions or software, for example, computing hardware (i.e.,computing apparatus), such as (in a non-limiting example) any computeror computer processor that can store, receive, retrieve, process and/oroutput data and/or communicate (network) with other computers. Accordingto an aspect of an embodiment, the described features, functions,operations, and/or benefits can be implemented by and/or use computinghardware and/or software. For example, the computer 120 and expert/andother sources devices 110, . . . , etc. can comprise a computingcontroller (CPU) (e.g., a hardware logic circuitry based computerprocessor that processes or executes instructions, namelysoftware/program), computer readable media, transmission communicationinterface (network interface), input device, and/or an output device,for example, a display device, and which can be in communication amongeach other through one or more data communication buses. In addition, anapparatus can include one or more apparatuses in computer networkcommunication with each other or other devices. In addition, a computerprocessor can refer to one or more computer processors in one or moreapparatuses or any combinations of one or more computer processorsand/or apparatuses. An aspect of an embodiment relates to causing and/orconfiguring one or more apparatuses and/or computer processors toexecute the described operations. The results produced can be output toan output device, for example, displayed on the display. 
An apparatus ordevice refers to a physical machine that performs operations, forexample, a computer (physical computing hardware or machinery) thatimplement or execute instructions, for example, by way of software,which is code executed by computing hardware, and/or by way of computinghardware (e.g., in circuitry, etc.), to achieve the functions oroperations being described. The functions of embodiments described canbe implemented in any type of apparatus that can execute instructions orcode. More particularly, programming or configuring or causing anapparatus or device, for example, a computer, to execute the describedfunctions of embodiments of the invention creates a new machine where incase of a computer a general purpose computer in effect becomes aspecial purpose computer once it is programmed or configured or causedto perform particular functions of the embodiments of the inventionpursuant to instructions from program software. According to an aspectof an embodiment, configuring an apparatus, device, computer processor,refers to such apparatus, device or computer processor programmed orcontrolled by software to execute the described functions.

A program/software implementing the embodiments may be recorded on a computer-readable medium, e.g., a non-transitory or persistent computer-readable medium. Examples of the non-transitory computer-readable media include a magnetic recording apparatus, an optical disk, a magneto-optical disk, and/or volatile and/or non-volatile semiconductor memory (for example, RAM, ROM, etc.). Examples of the magnetic recording apparatus include a hard disk device (HDD), a flexible disk (FD), and a magnetic tape (MT). Examples of the optical disk include a DVD (Digital Versatile Disc), DVD-ROM, DVD-RAM (DVD-Random Access Memory), BD (Blu-ray Disc), a CD-ROM (Compact Disc-Read Only Memory), and a CD-R (Recordable)/RW. The program/software implementing the embodiments may be transmitted over a transmission communication path, e.g., a wired and/or a wireless network implemented via hardware. An example of a communication medium via which the program/software may be sent is a carrier-wave signal.

The many features and advantages of the embodiments are apparent from the detailed specification and, thus, it is intended by the appended claims to cover all such features and advantages of the embodiments that fall within the true spirit and scope thereof. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the inventive embodiments to the exact construction and operation illustrated and described, and accordingly all suitable modifications and equivalents may be resorted to, falling within the scope thereof.

What is claimed is:
 1. A server configured to implement a service to output a result in response to a query, input through a user interface, of a target rule L as a target knowledge to be associated with a set of evidences E which is a subset of set U of evidences in an electronically stored knowledge base (KB) of the set U based on a collection of knowledge in form of rules A in form of a relational database and/or a deductive database in support of the target rule L, the server comprising: a processor coupled to a memory storing instructions which when executed by the processor cause the processor to execute a process to, obtain mapping information which associates evidences in the set U of evidences with respective weight values representing a corresponding strength characteristic of an evidence among the evidences; transform data from the KB to a Free-Form Database (FFDB); create, based on the FFDB, data objects in form E→A for data in the KB, where A is a rule, among rules in the KB, and E is a subset of evidences from the set U of evidences to support the rule A, and an evidence among the evidences in E indicatable as uncertain and/or incomplete based on the corresponding strength characteristic of the evidence, according to a mapped weight value among mapped weight values in the mapping information, so that the rule A is supportable by the set U of evidences E according to the mapped weight values; compute relationship constraints _(κ) based on the respective weight values indicating validity and/or plausibility values, the relationship constraints _(κ) indicating relations among a plurality of sets of evidences E, including at least one or any combination of a subset relation or a disjoint relation among the plurality of sets of evidences E; compute for the target rule L, a first composite object ω₁=(G→L₀), the first composite object ω₁ constructed from a combination of the data objects E→A to create a composite set of evidences G subject to the relationship constraints _(κ) which support a composite rule L₀, the composite rule L₀ being a combination of the rules implying the target rule L, the first composite object ω₁ being indicative, based upon the computed relationship constraints _(κ), of deductive reasoning of a first validity and/or a first plausibility value for the target rule L; and output the result representing an extracted knowledge from the KB in form of the queried target rule L associated with the E according to the first composite object ω₁.
 2. The server according to claim 1, wherein the process is to add the target rule L to the FFDB to form an augmented knowledge base (AKB) based on the target rule L.
 3. The server according to claim 2, wherein the process is to further: create a largest composite set of evidences G₁ according to a union of a plurality of sets of evidences E, subject to the relationship constraints _(κ) which support the composite rule L₀; compute, based on the G₁, a second composite object ω₂ indicative by inductive reasoning of a second validity value for the target rule L; and output the result representing an extracted knowledge from the KB in form of the target rule L associated with the E according to the second composite object ω₂, and add the target rule L to the FFDB to thereby augment the FFDB with the target rule L in form of the AKB.
 4. The server according to claim 3, wherein the process is to further: obtain a set complement of the largest composite set of evidences G₁ that supports negation of the target rule L; compute, based on the set complement of the largest composite set of evidences G₁, a third composite object ω₃ indicative by inductive reasoning of a second plausibility value for the target rule L; and output the result representing an extracted knowledge from the KB in form of the target rule L associated with the E according to the third composite object ω₃, and add the target rule L to the FFDB to thereby augment the FFDB with the target rule L in form of the AKB.
 5. The server according to claim 3, wherein the process is to further: add a new object in the form U→L′, where the new object is a negated target rule L′ supported by the U, to the KB, to thereby provide an expanded KB; the largest composite set of evidences G₁ is created from the expanded KB that supports a logical falsehood F created on basis of the negated target rule L′, subject to the relationship constraints _(κ) which support the composite rule L₀, to thereby compute the second validity value for the target rule L.
 6. The server according to claim 4, wherein the process is to further: add a new object in the form U→L, where the new object is the target rule L supported by the U, to the KB, to thereby provide an expanded KB; the largest composite set of evidences G₁ is created from the expanded KB that supports a logical truth T created on basis of the target rule L, subject to the relationship constraints _(κ) which support the composite rule L₀, to thereby compute the second plausibility value for the target rule L.
 7. The server according to claim 1, wherein the process is to further: determine a complement of the first composite object ω₁′=(G′→L₀′), the complement of the first composite object ω₁′=(G′→L₀′) being a complement of the composite set of evidences G′ and a negation of the composite rule L₀′, subject to the relationship constraints _(κ); and compute, based on the complement of the first composite object ω₁′, for the target rule L, a fourth composite object ω₄ indicative by inductive reasoning of a third validity and/or a third plausibility value for the target rule L.
 8. The server according to claim 3, wherein to create the largest composite set of evidences G₁ according to the union of the plurality of sets of evidences E, the process is to further: repeatedly combine two or more sets of evidences E among the plurality of sets of evidences E by merging rules A and taking intersection of the two or more sets of evidences E, subject to the relationship constraints _(κ) among the two or more sets of evidences E.
 9. The server according to claim 1, wherein the process is to further: construct a consistent higher order KB according to different methods of removing and/or altering objects E→A in the KB that are inconsistent with other objects E→A, wherein the data objects in the form E→A are created for the consistent higher order KB.
 10. The server according to claim 1, wherein the process is to further: for the mapping information, extend a measure of the composite sets of evidences G, where a measure is a function which maps the respective weight values to the set of evidences E, by: selecting a measure defined on the KB which is probabilistic to filter anomalies associated with the deductive reasoning; representing the relationship constraints _(κ) among the plurality of sets of evidences E in data nodes represented in form of directed and undirected graphs; ensuring that a graph representation of the relationship constraints _(κ) is admissible and compatible with the measure; preprocessing the measure subject to the relationship constraints _(κ), to obtain a new measure free of any relationship constraints; and applying the new measure to the set of evidences E.
 11. The server according to claim 1, wherein the relationship constraints _(κ) are input.
 12. The server according to claim 1, wherein the first validity and/or the first plausibility value is indicative of a strength of validity and/or plausibility for the target rule L.
 13. The server according to claim 1, wherein to transform the data from the KB to the FFDB includes a conversion of the data from the KB to a plurality of FFDs as a plurality of knowledge bases; and determining an equivalence between two FFDs among the plurality of FFDs based on respective computed first composite objects ω₁ in the two FFDs indicating a first FFD of the two FFDs being a subset of a second FFD of the two FFDs.
 14. The server according to claim 13, wherein the equivalence includes the first FFD being derivable from the second FFD and/or the second FFD being derivable from the first FFD.
 15. A server, comprising: a non-transitory computer readable storage medium to store: a Free-Form Database as a knowledge base (KB) resulting from a conversion of data which is indicated as uncertain according to a certainty factor and/or incomplete from sources inclusive of relational databases, deductive databases, and/or inductive databases, mapping information, which associates information as evidences in a set U of evidences in the KB with respective weight values representing a strength characteristic of an evidence among the evidences in the KB, the strength characteristic indicative of a certainty factor and/or incompleteness, and objects in form of E→A, where A is a rule among rules in the KB, and E is a subset of evidences E in the sets of evidences U from the knowledge base that supports the rule A, so that the rule A is supportable by the subset of evidences E; and at least one hardware processor coupled to at least one memory to execute instructions stored in the at least one memory, which instructions when executed by the at least one hardware processor, control the server to, augment the data of the KB resulting from a composite object ω of data which is indicated as uncertain and/or incomplete in form of an AKB, the composite object ω resulting from, obtaining relationship constraints _(κ) in form of set relations among a plurality of subsets of evidences E; and computing for a target rule L, a composite object with respect to validity (v), ω₁_(v)=G₁→L₀, and/or with respect to plausibility (p), ω₁_(p)=G₁→L₀, the composite object ω₁_(v) or ω₁_(p) constructed from a combination of the objects E→A to create a composite set of evidences G₁_(v) or G₁_(p) subject to the relationship constraints _(κ) which support a composite rule L₀_(v) or L₀_(p), the composite rule L₀_(v) or L₀_(p) being a combination of the rules implying the target rule L according to deductive reasoning indicated by mapped weight values in the mapping information to the sets of evidences U, the composite object ω₁_(v) or ω₁_(p) representing derived new knowledge based upon the relationship constraints _(κ) as indicative according to the deductive reasoning based upon a validity value (v) or a plausibility value (p) for the target rule L.
 16. At least one server, comprising: at least one processor; and at least one non-transitory computer readable storage medium configured to store, a Free-Form Database as a knowledge base (KB) resulting from a conversion of data which is indicated as uncertain according to a certainty factor and/or incomplete from sources inclusive of relational databases, deductive databases, and/or inductive databases; mapping information, which associates information as evidences in the KB with respective weight values representing a strength characteristic of an evidence among the evidences in the KB, the strength characteristic indicative of a certainty factor and/or incompleteness; objects in form of E→A, where A is a rule among rules in the KB, and E is a subset of evidences E in a set U of evidences from the KB that supports the rule A, so that the rule A is supportable by the subset of evidences E; and instructions which when executed by the at least one processor cause the at least one server to execute a process to, augment the data of the KB in form of an augmented knowledge base (AKB) resulting from at least one composite object ω to provide the data which is indicated as uncertain and/or incomplete in form of the AKB, the at least one composite object ω resulting from, obtaining relationship constraints _(κ) in form of set relations among a plurality of subsets of evidences E; and computing for a target rule L, a composite object with respect to validity (v), ω₁_(v)=G₁→L₀, and/or with respect to plausibility (p), ω₁_(p)=G₁→L₀, the composite object ω₁_(v) or ω₁_(p) constructed from a combination of the objects E→A to create a composite set of evidences G₁_(v) or G₁_(p) subject to the relationship constraints _(κ) which support a composite rule L₀_(v) or L₀_(p), the composite rule L₀_(v) or L₀_(p) being a combination of the rules implying the target rule L according to deductive reasoning indicated by mapped weight values in the mapping information to the sets of evidences U, the composite object ω₁_(v) or ω₁_(p) indicative according to the deductive reasoning of a first validity value or a first plausibility value for the target rule L; and/or computing for at least one target rule L, a composite object with respect to i-validity (v), ω₁_(iv)=G₁→L₀, and/or with respect to i-plausibility (p), ω₁_(ip)=G₁→L₀, the composite object ω₁_(iv) or ω₁_(ip) constructed from a combination of the objects E→A to create a composite set of evidences G₁_(iv) or G₁_(ip) subject to the relationship constraints _(κ) which support a composite rule L₀_(iv) or L₀_(ip), the composite rule L₀_(iv) or L₀_(ip) being a combination of the rules implied by the target rule L according to inductive reasoning indicated by the mapped weight values in the mapping information to the sets of evidences U, the composite object ω₁_(iv) or ω₁_(ip) indicative according to the inductive reasoning of a first i-validity value or a first i-plausibility value for the target rule L.
 17. The at least one server according to claim 16, wherein: the mapped weight values to the sets of evidences U are according to at least one measure, the at least one measure is at least one function or at least one mapping indicative of a strength of the first i-validity value or the first i-plausibility value for the target rule L with respect to inductive reasoning; and the computed composite object with respect to validity ω₁_(iv) includes computing the first i-validity value by determining a value of the G₁_(iv) using mapped weight values in the mapping information of the plurality of subsets of evidences E, the first i-validity value indicative of the strength of i-validity for the target rule L according to the measure, or the computed composite object with respect to plausibility ω₁_(ip) includes computing the first i-plausibility value by determining a value of the G₁_(ip) using mapped weight values in the mapping information of the plurality of subsets of evidences E, the first i-plausibility value indicative of the strength of i-plausibility for the target rule L according to the at least one measure.
 18. The at least one server according to claim 16, wherein the instructions further cause the at least one processor to: create a new knowledge base by transforming, for each object ω=E→L in the knowledge base, L into conjunctive normal form and including in the new knowledge base each E→L_(i) where L_(i) is a disjunction in L.
 19. The at least one server according to claim 16, wherein the at least one target rule L includes rules L₁ and L₂, and the instructions further cause the at least one processor to: estimate the validity or plausibility of L according to deductive reasoning by repeated applications of decomposition rules including: validity of a disjunction L₁∨L₂ being a subset of union of validity of L₁ and validity of L₂; validity of a conjunction L₁∧L₂ being an intersection of validity of L₁ and validity of L₂; a plausibility of a disjunction L₁∨L₂ being a union of plausibility of L₁ and plausibility of L₂; and a plausibility of a conjunction L₁∧L₂ being a subset of an intersection of plausibility of L₁ and plausibility of L₂.
 20. The at least one server according to claim 16, wherein the at least one target rule L includes rules L₁ and L₂, and the instructions further cause the at least one processor to: determine, in response to the knowledge base being consistent, the validity or plausibility of L according to deductive reasoning by repeated applications of decomposition rules including: validity of a disjunction L₁∨L₂ being union of validity of L₁ and validity of L₂; validity of a conjunction L₁∧L₂ being intersection of validity of L₁ and validity of L₂; plausibility of a disjunction L₁∨L₂ being a union of plausibility of L₁ and plausibility of L₂; and plausibility of a conjunction L₁∧L₂ being an intersection of plausibility of L₁ and plausibility of L₂.
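
By way of a non-limiting illustration, the decomposition rules recited for a consistent knowledge base (disjunction mapped to a union, conjunction mapped to an intersection) might be sketched as follows. The sketch assumes that evidences are plain string identifiers, that a rule is either an atomic name or a nested tuple, and that the evidence sets for the atomic rules are already known; the names `decompose`, `Rule`, `Evidence` and the sample sets are hypothetical and do not appear in the specification. The same function applies whether the supplied sets are validity sets or plausibility sets.

```python
from typing import Dict, FrozenSet, Tuple, Union

# Assumption: a set of evidences is modeled as a frozenset of identifiers.
Evidence = FrozenSet[str]
# Assumption: a rule is an atomic name or a tuple ("or" | "and", left, right).
Rule = Union[str, Tuple[str, "Rule", "Rule"]]


def decompose(rule: Rule, atom_support: Dict[str, Evidence]) -> Evidence:
    """Recursively apply the union/intersection rules down to atomic rules."""
    if isinstance(rule, str):
        # Atomic rule: look up its (hypothetical) supporting evidence set.
        return atom_support[rule]
    op, left, right = rule
    lhs = decompose(left, atom_support)
    rhs = decompose(right, atom_support)
    if op == "or":
        # Disjunction: union of the two evidence sets.
        return lhs | rhs
    if op == "and":
        # Conjunction: intersection of the two evidence sets.
        return lhs & rhs
    raise ValueError(f"unsupported connective: {op!r}")


# Hypothetical validity sets for two atomic rules L1 and L2.
validity = {
    "L1": frozenset({"e1", "e2"}),
    "L2": frozenset({"e2", "e3"}),
}

validity_of_disjunction = decompose(("or", "L1", "L2"), validity)
validity_of_conjunction = decompose(("and", "L1", "L2"), validity)
```

A strength value could then be obtained by applying a measure to the resulting set, for example by summing hypothetical per-evidence weights; the choice of measure is left open here.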